(57)



ABSTRACT 



A determination can be made of how changes to underlying
data affect the value of objects. Examples of applications
are: caching dynamic Web pages; client-server applications
whereby a server sending objects (which are changing all the
time) to multiple clients can track which versions are sent to
which clients and how obsolete the versions are; and any
situation where it is necessary to maintain and uniquely
identify several versions of objects, update obsolete objects,
quantitatively assess how different two versions of the same
object are, and/or maintain consistency among a set of
objects. A directed graph, called an object dependence graph,
may be used to represent the data dependencies between
objects. Another aspect is constructing and maintaining
objects to associate changes in remote data with cached
objects. If data in a remote data source changes, database
change notifications are used to "trigger" a dynamic rebuild
of associated objects. Thus, obsolete objects can be
dynamically replaced with fresh objects. The objects can be
complex objects, such as dynamic Web pages or
compound-complex objects, and the data can be underlying
data in a database. The update can include either storing a
new version of the object in the cache, or deleting an object
from the cache. Caches on multiple servers can also be
synchronized with the data in a single common database.
Updated information, whether new pages or delete orders, can
be broadcast to a set of server nodes, permitting many systems
to simultaneously benefit from the advantages of prefetching
and providing a high degree of scaleability.

16 Claims, 47 Drawing Sheets 
SCALEABLE METHOD FOR MAINTAINING
AND MAKING CONSISTENT UPDATES TO
CACHES

CROSS REFERENCE TO RELATED
APPLICATION

This application is a Divisional of U.S. application Ser.
No. 08/905,225, filed Aug. 1, 1997. The present invention is
related to U.S. patent application Ser. No. 08/905,114, filed
of even date herewith, entitled: "Determining How Changes
to Underlying Data Affect Cached Objects," by Challenger
et al., now U.S. Pat. No. 6,026,413. This application, which
is commonly assigned with the present invention to the
International Business Machines Corporation, Armonk,
N.Y., is hereby incorporated herein by reference in its
entirety.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention is related to an improved data
processing system. Particular aspects relate to the World
Wide Web, databases, and transaction processing systems. A
more particular aspect is related to the caching of dynamic
documents on the World Wide Web.

2. Related Art

Complex objects can be expensive and time-consuming to
create. Caching complex objects reduces the cost of creation
by minimizing the frequency of regeneration of identical
objects. The cost of generating objects in the absence of
caching is reflected to end-users in terms of: (a) increased
response time; and (b) inconsistent response time.

Consider a Web-based server with a very high frequency
of access, whose content contains a high ratio of dynamic to
static pages. Assume further that the content of the dynamic
pages changes frequently. When a page becomes obsolete and
is flushed from cache, the first user who requests that page
will experience a cache-miss, causing regeneration of that
page. Because the cost (and therefore, the physical
wall-clock time) of creating that page is great, there may be a
significant probability of several other requests for that same
page arriving before it is replaced in cache. This can result
in many simultaneous regenerations of the same page, and
resultant wasted resources. A specific instance of this
scenario is a sports server, for example, serving the Olympics.
Results for the currently active sports are arriving at a high
rate, causing the pages that reflect scores to change
frequently; at the same time users are requesting those pages at
a high rate to see the status of the event. Because the pages
are being invalidated frequently, a significant number of
requests cause the page to be regenerated. Thus there is a
need for a system which maintains the validity of the page
in one or more caches at all times, and automatically
replaces it when the underlying data changes, thereby
reducing system loading and significantly improving response
time. The present invention addresses such a need.

Another problem is manifested on web servers where
consistency of response time is critical. Once users have
accessed a site, or a location within a site, keeping their
attention may be of prime importance. For example, a
Web-based mail-order catalog may want to encourage
browsing; if the user gets bored waiting for pages he or she
may well leave for other entertainment.

The present invention is of particular importance to proxy
caches (see "Caching Proxies: Limitations and Potentials"
by M. Abrams et al., Fourth International World Wide Web
Conference Proceedings, December 1996, pp. 119-133; and
"World-Wide Web Proxies", A. Luotonen and K. Altis, in
Computer Networks and ISDN Systems, vol. 27 (1994) pp.
147-154). One of the problems with most proxy caches on
the Web today is that there is no way to determine if pages
in the caches are obsolete. For this reason, most proxy
caches do not store dynamic pages. The present invention
solves this problem and provides a powerful method for
[illegible] distributed across a network.
[illegible] a method and system for
automatically detecting changes in the underlying data and
efficiently replacing objects dependent on that data in one or
more caches as the primary mechanism for cache
maintenance. The present invention addresses such a need. Existing
cache invalidation schemes typically involve some variant
of (a) aging, in which items which have not been referenced
within some period of time are removed from cache, and (b)
forceful deletion of items known to be obsolete.

A considerable amount of work has been done in the area
of cache coherence for shared-memory multiprocessors (see
"Computer Architecture: A Quantitative Approach" by J.
Hennessy and D. Patterson, Morgan Kaufmann Publishers,
Inc., 1996). In shared-memory multiprocessors, no caches
are allowed to contain obsolete values. For example,
suppose the variable x=99 is stored in caches belonging to
processors p1, p2, and p3. Another processor p4 wishes to
change the value of x to 255. Before p4 can update x, it must
ensure that p1, p2, and p3 have invalidated x from their
caches. It is only at this stage that p4 can update x.

However, Web caches operate in a different environment
from the environment that processor caches operate in. In
processor caches, incorrect behavior can result if a cache
contains a value which is even a fraction of a second out of
date. For Web caches, it is often acceptable for a cached Web
document to be slightly out of date. For example, suppose
that a Web document w is contained in three caches (c1, c2,
and c3) and that the Web document w is managed and
updated by a data source d. Using the multiprocessor cache
coherence approach, the data source d must first invalidate
the Web document w from c1, c2, and c3 before updating the
Web document. Thus, the multiprocessor cache coherence
approach would cause the Web document w to be absent
from the cache for a certain period of time whenever the
Web document was updated. Requiring the data source d to
invalidate the Web document w in caches before performing
the update results in slower updates and cache misses
during the extra time that the Web document w is not present
in the cache. Thus, there is also a need for a method and
system which provides faster updates and higher cache hit
rates. The present invention addresses such a need.

SUMMARY OF THE INVENTION

In accordance with the aforementioned needs, the present
invention is directed to a method and system for maintaining
updated caches and making consistent updates.

The present invention has features for constructing and
maintaining objects to associate changes in remote data with
cached objects. In one embodiment, if data in a remote data
source changes, database change notifications are used to
"trigger" a dynamic rebuild of associated objects. The
information communicated from the data source to the cache
can be either an identifier of an object whose value has
changed, or information about the initially changed data. In
the latter case, the cache(s) receiving the information about
the initially changed data would compute the identity of the
objects affected. In either event, rather than deleting stale
items from the cache when they become obsolete, they can
be immediately replaced with fresh objects. According to
another aspect of the present invention, the objects can be
compound-complex objects, that is, an object composed of
multiple complex objects; and the data can be underlying
data.

In a system including one or more caches storing objects
and one or more remote data sources storing data which may
affect the value of a cached object, a method having features
of the present invention for coordinating updates to a cache
includes the steps of recognizing when at least part of the
data stored in a remote data source has changed;
communicating to a cache, one or more of: information about at
least part of the data which has changed; and information
which includes the identity of at least one object whose
value has changed as the result of the changes to the data;
and information which allows the identity to be determined
of at least one object whose value has changed as the result
of the changes to the data; and updating a cache, in response
to the communicating step.

According to another aspect of the present invention, the
update can include either storing a new version of the object
in the cache; or deleting an object from the cache.

The present invention has features which ensure that
end-users never observe that an item is not in the cache, and
that each item can be regenerated exactly once, regardless of
the current rate of requests.

The present invention has still other features for
synchronizing caches on multiple servers with the data in a single
common database. Updated information, whether new pages
or delete orders, can be broadcast to a set of server nodes,
permitting many systems to simultaneously benefit from the
advantages of prefetching and providing a very high degree
of scaleability.

In a system comprising a set of one or more transaction
managers, a method for consistently performing a set S of
one or more state-changing transactions which modify state
managed by a set T of one or more transaction managers
includes the steps of (a) acquiring a plurality of locks on data
known as locked data which prevent transactions not in S
from one of (i) modifying data accessed by a transaction in
S and (ii) reading data modified by a transaction in S; (b)
storing a blocked request set B comprising one or more
transaction requests which cannot be completed because of
locks acquired in step (a); (c) determining a timestamp at
which a last lock (last_lock_time) was obtained in step (a);
(d) enabling transactions in B received before the
last_lock_time to access locked data before transactions in S
access the locked data; (e) enabling transactions in S to
access the locked data before enabling transactions in B
received after last_lock_time to access the locked data; and
(f) enabling transactions in B received after the
last_lock_time to access the locked data after transactions in S have
accessed the locked data.

BRIEF DESCRIPTION OF THE DRAWINGS

These and other features and advantages will become
apparent from the following detailed description and
accompanying drawings, wherein:

FIG. 1a depicts an example of a system having features of
the present invention;

FIG. 1b depicts an example of an object dependence
graph having features of the present invention;

FIG. 1c depicts an example of a system having features of
the present invention;

FIG. 2 depicts an example of a cache used in accordance
with the present invention;

FIG. 3 depicts an example of an object information block
(OIB) used in accordance with the present invention;

FIG. 4 depicts an example of API functions in accordance
with the present invention;

FIG. 5 depicts a block diagram of a method for
implementing the API functions of FIG. 4;

FIG. 6 depicts a block diagram of an API function which
adds an object to a cache;

FIG. 7 depicts a block diagram of an API function which
[illegible];

FIG. 8 depicts a block diagram of an API function which
deletes an object from a cache;

FIG. 9 depicts a block diagram of an API function which
adds a dependency from a record to an object;

FIG. 10 depicts a block diagram of an API function which
deletes a dependency from a record to an object;

FIG. 11 depicts a block diagram of an API function which
is invoked when a record changes;

FIG. 12a depicts another example of a system having
features of the present invention;

FIG. 12b depicts another example of an object
dependence graph having features of the present invention;

FIG. 12c depicts an example of the object manager of
FIG. 12a;

FIG. 12d is another depiction of an object dependence
graph having features of the present invention;

FIG. 13 depicts an example of a cache used in accordance
with an embodiment of the present invention;

FIG. 14 depicts an example of an object information block
(OIB) used in accordance with the present invention;

FIG. 15 depicts an example of a dependency list used in
accordance with the present invention;

FIG. 16 depicts an example of a dependency information
block (DIB) used in accordance with the present invention;

FIG. 17 depicts [illegible] functions in
accordance with the present invention;

FIG. 18 [illegible];

FIG. 19 depicts a block diagram of a cache API function
which adds the latest version of an object to a cache;

FIG. 20 depicts a block diagram of an API function which
[illegible] another;

FIG. 21 depicts a block diagram of an API function which
is invoked when underlying data change;

FIG. 22 depicts a block diagram of part of a method for
propagating changes through the object dependence graph in
response to changes to underlying data;

FIG. 23 depicts a block diagram of part of a method for
propagating changes through the object dependence graph in
a depth-first manner in response to changes to underlying
data;

FIG. 24 depicts a block diagram of part of a method for
propagating changes to a specific graph object in response to
changes to underlying data;

FIG. 25 depicts a block diagram of part of a method for
updating or invalidating a cached version of an object in
response to changes to underlying data;

FIG. 26 depicts a block diagram of part of a method for
maintaining consistency when one or more objects are added
to one or more caches in response to changes to underlying
data;
FIG. 27 depicts a block diagram of a cache API function
for creating graph nodes corresponding to single record
objects (SRO's);

FIG. 28 depicts a block diagram of an API function for
creating graph nodes corresponding to multiple record
objects (MRO's);

FIG. 29a depicts a block diagram of an API function
which may be invoked when records change;

FIG. 29b depicts another example of an object
dependence graph and how it can be used for propagating changes
to graph objects;

FIG. 30a depicts a block diagram example of a system
having features of the present invention for scaleably
maintaining and consistently updating caches;

FIG. 30b depicts a more detailed example of the Trigger
Monitor of FIG. 30a instantiated as a Master Trigger
Monitor;

FIG. 30c depicts an example of the Trigger Monitor
instantiated as a Slave Trigger Monitor;

FIG. 30d depicts an example of the send_trigger API of
FIG. 30b;

FIG. 30e depicts examples of transaction types in
accordance with the present invention;

FIG. 31 depicts an example of the Object Disposition
Block (ODB) of FIG. 30d;

FIG. 32 depicts an example of the cache ID of FIG. 31;

FIGS. 33A and 33B depict an example of a high-level
organization and communication paths of the Trigger
Monitor Driver and the Distribution Manager;

FIG. 34 depicts an example of the Receiving Thread logic
of FIG. 33;

FIG. 35 depicts an example of the Incoming Work
Dispatcher Thread logic of FIG. 33;

FIG. 36 depicts an example of the Cache Manager
Communications Thread logic of FIG. 33;

FIG. 37 depicts an example of the Object Generator
Thread logic of FIG. 33;

FIG. 38 depicts an example of the Distribution Manager
Thread logic of FIG. 33;

FIG. 39 depicts an example of the Outbound Transaction
Thread logic of FIG. 33;

FIG. 40 depicts examples of extensions and variations for
analysis and translations of Trigger Events;

FIG. 41 depicts an example of logic for making a set of
requests consistently to a system consisting of one or more
caches; and

FIG. 42 depicts an example of logic for determining a
last_lock_time if the set of cache managers receiving a
request has multiple members.

DETAILED DESCRIPTION OF A METHOD FOR
DETERMINING
HOW CHANGES TO UNDERLYING DATA
AFFECT CACHED OBJECTS

Glossary of terms

While dictionary meanings are also implied by terms used
herein, the following glossary of some terms may be useful:

A cache is a storage area. It may be in memory, on disk,
or partly in memory and partly on disk. The physical or
virtual addresses corresponding to the cache may be fixed.
Alternatively, they may vary over time. The definition of
caches includes but is not limited to the following:

Caches for Web documents such as the proxy cache in the
IBM Internet Connection Server or the browser cache
in the Netscape Navigator;

Database caches such as in IBM's DB2 database;

Processor caches such as those in the IBM RS/6000 line
of computers; and

Storage repositories for data written in a high-level
programming language, wherein for at least some data, the
storage repository program does not have explicit
control of the virtual or physical addresses of where the
data are stored.

A cache union is the combination of all caches in a
system.

An object is data which can be stored in one or more
caches.

A multiple version cache is a cache which is allowed to
include multiple versions of the same object.

A single version cache is a cache which is only allowed
to include one version of the same object.

A current version cache is a single version cache in which
the version of any cached object must be current.

Underlying data include all data in the system which may
affect the value of one or more objects. Underlying data are
a superset of all objects in the system.

A complex object is an object with one or more
dependencies on underlying data.

The object manager is a program which determines how
changes to underlying data affect the values of objects.

A graph G=(V,E) consists of a finite, nonempty set of
vertices V, also known as nodes, and a set of edges E
consisting of pairs of vertices. If the edges are ordered pairs of
vertices (v, w), then the graph is said to be directed with v
being the source and w the target of the edge.

A multigraph is similar to a graph. The key difference is
that multiple edges may exist between pairs of vertices.
Multigraphs are supersets of graphs.

A weighted graph or weighted multigraph is one in which
each edge may optionally have a number known as a weight
associated with it.

The object dependence graph is a directed multigraph.
Vertices of the object dependence graph are known as graph
objects. Graph objects are supersets of objects and may
include the following:

(1) objects;

(2) underlying data which are not objects; and

(3) virtual objects.

Virtual objects do not correspond to actual data.
They are used as a convenience for propagating data
dependencies. Virtual objects are not as frequently used as (1) and
(2).

An edge from a graph object o1 to another graph object o2
indicates a data dependence (also called dependence or
dependency) from o1 to o2. This means that a change to o1
might also change o2.

Dependencies are transitive. Thus, if a has a data
dependence on b and b has a data dependence on c, then a
has a dependence on c.

[illegible] relational specifiers associated with them.
Two examples of RO's are:

1. Single record objects (SRO's); the relational specifier
represents a single record.

2. Multiple record objects (MRO's); the relational
specifier represents multiple records.

An RO r1 contains (includes) an RO r2 if all records
represented by r2 are also represented by r1.

The outgoing adjacency list for a node v is a list
containing all nodes w for which the edge (v, w) is contained in E.
The incoming adjacency list for a node v is a list
containing all nodes w for which the edge (w, v) is contained in
E.

A leaf node is a node which is not the target of any edges.

A proper leaf node is a leaf node which is the source of
at least one edge.

A maximal node is a node which is not the source of any
edges.

A proper maximal node is a maximal node which is the
target of at least one edge.

A simple dependence graph is a directed graph in which
each node is a leaf node or a maximal node.

Two objects o1 and o2 are consistent if either:

(1) Both objects are current; or

(2) At some time t in the past, both objects were current.

A version number is data which allows different versions
of the same object to be uniquely identified. One
implementation would be to use integers for version numbers and to
assign a newly created current version the version number
of the previous version plus 1. However, other
implementations are also possible and version numbers do not
necessarily have to be numbers. For example, text strings could
also be used to implement version numbers.

The most recent version of an object is known as the
current version.

Referring now to the drawings, FIG. 1a depicts an
example of a client-server architecture having features of the
present invention. As depicted, a client 90 communicates
requests to a server 100 over a network 95. The server 100
maintains one or more caches 2. As is conventional, the
server 100 uses the caches 2 to improve performance and
lessen the CPU time for satisfying the client 90 requests.
Although FIG. 1a shows the caches 2 associated with a
single server, the caches 2 could be maintained across
multiple servers as well. One skilled in the art could easily
adapt the present invention for other applications which are
not client-server based as well.

An application program 97 running on the server 100
creates objects and then stores those objects (e.g., dynamic
pages which do not cause state changes upon a request
therefor) on one or more caches 2. The server 100 can also
be a proxy server wherein the source of the underlying data
in the database 99 and the cache 2 are geographically
separated. In this embodiment, an object is data which can
be stored in one or more caches 2. The objects can be
constructed from underlying data stored on a database 99.
Underlying data include all data in the system which may
affect the value of one or more objects stored in the cache 2.
Underlying data are a superset of all objects in the system.
A complex object is an object with one or more
dependencies on the underlying data.

Also, let the caches 2 in the cache union all be current
version caches. Recall that a current version cache is a single
version cache in which the version of any cached object
must be current, and that a single version cache is a cache
which is only allowed to include one version of the same
object.

According to the present invention, a cache manager 1
(which is an example of an object manager) determines how
changes to underlying data affect the values of objects.
Although FIG. 1a shows the cache manager 1 residing on a
single server, it could be distributed across multiple servers
as well. The cache manager 1 is preferably embodied as
computer executable code tangibly embodied on a program
storage device for execution on a computer such as the
server 100 (or the client 90). Those skilled in the art will
appreciate that the cache 2, cache manager 1, and database
99 can be similarly associated with the client 90, in
accordance with the present invention.

The cache manager 1 provides APIs (FIG. 4) for
specifying what underlying data, e.g., database records, a cached
object depends upon. The cache manager 1 keeps track of
these dependencies. Whenever a process modifies state
which could affect the value of a complex object in a cache,
the process specifies the underlying data which it is
updating. The cache manager then invalidates all cached objects
which depend on the underlying data being updated.

FIG. 1b depicts an example of an object dependence
graph (G) 121' having features of the present invention. Note
that the object dependence graph (G) 121' in this
embodiment is less complex than in the alternative embodiment
([illegible]). Here, the object dependence graph 121' is a
simple dependence graph, i.e., a directed graph in which
each node is a leaf node r1 . . . r4 or a maximal node co1,
co2. Recall that a leaf node is a node which is not the target
of any edges and a maximal node is a node which is not the
source of any edges. Also note that every path is of length
1 and there is no need to specify weights for edges. Further,
each proper maximal node (a maximal node which is the
target of at least one edge) co1, co2 is an object; and each
proper leaf node r1 . . . r4 (a leaf node which is the source
of at least one edge) in G represents underlying data which
is not an object. The underlying data represented by each
proper leaf node r1 . . . r4 is referred to as a record (these
records are not synonymous with records used in the second
embodiment). The objects represented by proper maximal
nodes co1, co2 are complex objects.

The cache manager 1 maintains the underlying data
structures (see FIGS. 2-3) which represent the object
dependence graph(s) 121'. Application programs 97 communicate
the structure of object dependence graphs to the cache
manager 1 via a set of cache APIs (see FIG. 4). The
application also uses APIs to notify the cache manager 1 of
records r1 . . . r4 which have changed. When the cache
manager 1 is notified of changes to a record r1 . . . r4, it must
identify which complex objects co1, co2 have been affected
and cause the identified complex objects to be deleted from
(or updated in) any caches 2 containing them. The cache
manager 1 can determine which complex objects have
changed by examining edges in G (see FIG. 11).

For example, suppose that the cache manager 1 is notified
that r1 has changed. G 121' implies that co1 has also
changed. The cache manager 1 must make sure that co1 is
deleted from (or updated in) any caches 2 containing it. As
another example, suppose that r2 has changed. G 121'
implies that co1 and co2 are also affected. Here, the cache
manager must make sure that both co1 and co2 are deleted
from (or updated in) any caches 2 containing them.

In other words, the basic approach is to construct complex
objects at the application level so that they are dependent on
a set of records. The application must be able to specify
which records r1 . . . r4 a complex object co1, co2 depends
upon. For every process which modifies state in a manner
which could affect the value of a cached complex object, the
application program must be able to specify which records
are affected. Complex objects of this type are said to be in
normal form. Many preexisting Web applications create
cacheable complex objects which are already in normal
form. In order to use caching in these applications, it is only
necessary to recognize the records underlying complex
objects and to interface the application to the cache via the
APIs provided. Other changes to the applications are not
necessary.

Preferably, the cache manager 1 is a long running process
managing storage for one or more caches 2. However, one
skilled in the art could easily adapt the present invention for
a cache manager which is one of the following:

1. Multiple distinct processes, none of which overlap in
time.

2. Multiple distinct processes, some of which may overlap
in time. This includes multiple concurrent cache managers
so designed to improve the throughput of the cache system.

FIG. 1c depicts an example of a system in accordance
with the present invention for caching dynamic Web pages.
As depicted, consider a conventional Web site 100 where
pages (page 1 . . . page 5) are created dynamically from one or
more databases 99 and stored in one or more caches 2. An
example of a database 99 and database management system
adaptable to the present invention is that sold by the IBM
Corporation under the trademark DB2. Here, the dynamic
Web pages (page 1 . . . page 5) are objects and the underlying
data (tables/records) include parts of databases 99.

According to the present invention, a cache manager 1
provides APIs (FIG. 4) which allow an application program
97 to specify the records that a cached object depends
upon. The cache manager 1 keeps track of these
dependencies. Whenever an application program 97 modifies a
record(s) or learns about changes to a record which could affect
the value of a complex object in a cache, the application
program 97 notifies the cache manager 1 of the record(s)
which have been updated. The cache manager 1 then
invalidates or updates all cached objects with dependencies on the
record(s) which have changed.

For example, consider the HTML pages (page 1 . . . page
5) depicted in FIG. 1c. The HTML pages, which are
complex objects, are constructed from a database 99 and stored
in Cache3. Each HTML page may have dependencies on one
or more records which are portions of the database denoted
Table1, Table2, . . . , Table6. The correspondence between
the tables and pages can be maintained by hash tables and
record lists 19. For example, if the cache manager 1 were
notified of a change to Table1 T1, it would invalidate (or
update) Page1. Similarly, if the cache manager were notified
of a change to Table2 T2, it would invalidate (or update)
Page1, Page2, and Page3.
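
The table-to-page example above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation described in the figures: the names `dependents` and `invalidate_for` are hypothetical, a plain dictionary stands in for the cache, and only the two dependencies stated in the text are encoded (Table1 affects Page1; Table2 affects Page1, Page2, and Page3).

```python
# Hypothetical sketch: record-to-page dependencies taken from the example above.
dependents = {
    "Table1": ["Page1"],
    "Table2": ["Page1", "Page2", "Page3"],
}


def invalidate_for(record_id, cache):
    """Remove every cached page that depends on record_id; return their ids."""
    removed = []
    for page_id in dependents.get(record_id, []):
        # pop() returns None when the page is not (or no longer) cached.
        if cache.pop(page_id, None) is not None:
            removed.append(page_id)
    return removed


# Five cached pages, as in FIG. 1c; the page bodies are placeholders.
cache = {f"Page{i}": f"<html>page {i}</html>" for i in range(1, 6)}
print(invalidate_for("Table2", cache))  # ['Page1', 'Page2', 'Page3']
print(sorted(cache))                    # ['Page4', 'Page5']
```

The sketch only invalidates; replacing a stale page with a freshly generated one, which the text also allows, would call a page generator in place of `pop`.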

FIG. 2 depicts an example of the cache 2. As depicted,
each cache 2 preferably has 4 storage areas: a directory 3,
which maintains information about each cached object; an
object storage 4 for storing the objects 6; auxiliary state
information 5 which includes other state information (e.g.,
statistics maintained by the cache); and a hash table 19, which
stores information about records in the hash table entries 25.

In a preferred embodiment, the hash table entries 25
comprise record IDs 12; and object lists 8, which include the
list of objects, i.e., object id(s) 9, whose values depend on a
record which may be part of a database 99. However, the
present invention also allows other kinds of information to
be stored in the hash table entries. The purpose of the hash
table is to provide an efficient method for finding
information about a particular table/record. Preferably hashing is
keyed on the record ID 12. Hash tables are well known in the
art (see e.g., "The Design and Analysis of Computer
Algorithms", Aho, Hopcroft, Ullman, Addison-Wesley,
1974). Hash tables provide an efficient data structure for the
present invention. However, the present invention is
compatible with a wide variety of other data structures and is not
limited to using hash tables.

The directory 3 includes an object information block
(OIB) 10 for each object 6 stored in the cache. One of the
components of the OIB 10 is a record list 11 (FIG. 3) which
is used to store all of the record IDs 12 identifying records
r1 . . . r4 associated with a complex object co1, co2. Here,
the complex objects are dynamic web pages (page 1 . . . page
5) stored in the cache 2 and the records may be part of a
database 99. Although the preferred embodiment uses text
strings for record IDs, other methods are compatible as
well.
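
The four storage areas and the two cross-referencing lists can be sketched as follows. This is a minimal illustration, not the patented layout: the class and field names are assumptions chosen to mirror the reference numerals, and Python dictionaries stand in for the hash tables.

```python
from dataclasses import dataclass, field


@dataclass
class ObjectInfoBlock:
    """OIB 10: one per cached object."""
    record_list: list = field(default_factory=list)   # record list 11


@dataclass
class HashTableEntry:
    """Hash table entry 25: one per record."""
    record_id: str                                     # record ID 12
    object_list: list = field(default_factory=list)   # object list 8


@dataclass
class Cache:
    """Cache 2 with its four storage areas."""
    directory: dict = field(default_factory=dict)       # directory 3: object_id -> OIB
    object_storage: dict = field(default_factory=dict)  # object storage 4
    aux_state: dict = field(default_factory=dict)       # auxiliary state information 5
    hash_table: dict = field(default_factory=dict)      # hash table 19: record_id -> entry


cache = Cache()
cache.object_storage["co1"] = "<html>...</html>"
cache.directory["co1"] = ObjectInfoBlock(record_list=["r1", "r2"])
# Mirror each record ID on the record list into the matching object list.
for rid in cache.directory["co1"].record_list:
    entry = cache.hash_table.setdefault(rid, HashTableEntry(rid))
    entry.object_list.append("co1")
```

Keeping both directions (object to records, record to objects) is what lets a deletion or invalidation find every affected list without scanning the whole cache.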

An application program communicates with the cache 
manager 1 via a set of API functions. Examples of APIs in 
accordance with the present invention are shown in FIG. 4. 
Those skilled in the art will appreciate that many additional 
APIs can be implemented in a straightforward manner 
within the spirit and scope of the present invention. As 
depicted, the example APIs are: 

cache_object (object_id, object, cache_id) 410: stores an
object 6 identified by object in the cache 2 (FIG. 2)
identified by cache_id under a key object_id 9, overwriting
any previous object 6 having the same key. The present
invention is compatible with a wide variety of types for
object_id, object, and cache_id. In the preferred
embodiment, the object 6 may be of several types, the
object_id is a byte string, and the cache_id is a character
string. Here, multiple items with the same key are
preferably not allowed to exist in the same cache concurrently.
However, it would be easy for one skilled in the art
to use the present invention in a situation where multiple
items with the same key could exist in the same cache
concurrently.

lookup_object (object_id, cache_id) 415: look for an
object 6 with a key object_id 9 in the cache 2 identified by
cache_id. If any such object 6 exists, return it to the
application program.

delete_object (object_id, cache_id) 420: look for an
object 6 with a key object_id 9 in the cache identified by
cache_id. If any such object 6 exists, delete it.

add_dependency (object_id, cache_id, record_id) 430:
look for an object 6 with a key object_id 9 in the cache 2
identified by cache_id. If any such object 6 exists and there
is no dependency between the object 6 and a record identified
by the record_id 12, add the dependency.

delete_dependency (object_id, cache_id, record_id) 440:
look for an object 6 with a key object_id 9 in the cache
identified by cache_id. If any such object 6 exists and there
is a dependency between the object 6 and a record identified
by record_id 12, delete the dependency.

invalidate_record (cache_id, record_id) 450: delete all
cache objects from the cache 2 identified by cache_id which
depend on the record identified by the record_id.

show_dependent_objects (cache_id, record_id) 460:
return a list of object_ids 9 for all objects in the cache 2
identified by the cache_id which depend on the record
identified by the record_id. This function can be implemented
by returning the object list 8 for the hash table entry
25 corresponding to the record identified by record_id. A
status variable can also be returned to indicate if either the
cache 2 or the hash table entry 25 is not found.

show_associated_records (cache_id, object_id) 470:
return a list of record_ids 12 for all records which the object
6, identified by object_id in the cache 2 identified by
cache_id, depends on. This function can be implemented by
returning the record list 11 (FIG. 3) for the object 6 identified
by the object_id in the cache 2 identified by the cache_id.
A status variable can also be returned to indicate if either the
cache or the object 6 is not found.
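
The example APIs listed above can be sketched in Python as follows. This is an illustrative toy under stated assumptions, not the implementation of FIG. 4: status variables and locking are omitted, create_cache and _drop are hypothetical helpers that are not among the listed APIs, and sets stand in for the record lists and object lists.

```python
class CacheManager:
    """Toy cache manager; data layout and return shapes are illustrative."""

    def __init__(self):
        self.caches = {}

    def create_cache(self, cache_id):  # hypothetical helper
        self.caches[cache_id] = {"objects": {}, "records_of": {}, "objects_of": {}}

    def cache_object(self, object_id, obj, cache_id):
        c = self.caches[cache_id]
        c["objects"][object_id] = obj  # overwrites any previous object
        c["records_of"].setdefault(object_id, set())

    def lookup_object(self, object_id, cache_id):
        return self.caches[cache_id]["objects"].get(object_id)

    def add_dependency(self, object_id, cache_id, record_id):
        c = self.caches[cache_id]
        if object_id in c["objects"]:
            c["records_of"][object_id].add(record_id)
            c["objects_of"].setdefault(record_id, set()).add(object_id)

    def delete_dependency(self, object_id, cache_id, record_id):
        c = self.caches[cache_id]
        if object_id in c["objects"] and record_id in c["records_of"][object_id]:
            c["records_of"][object_id].discard(record_id)
            self._drop(c, record_id, object_id)

    def delete_object(self, object_id, cache_id):
        c = self.caches[cache_id]
        if object_id in c["objects"]:
            for record_id in c["records_of"].pop(object_id):
                self._drop(c, record_id, object_id)
            del c["objects"][object_id]

    def invalidate_record(self, cache_id, record_id):
        c = self.caches[cache_id]
        # Copy the object list first: delete_object mutates it.
        for object_id in list(c["objects_of"].get(record_id, ())):
            self.delete_object(object_id, cache_id)

    def show_dependent_objects(self, cache_id, record_id):
        return sorted(self.caches[cache_id]["objects_of"].get(record_id, ()))

    def show_associated_records(self, cache_id, object_id):
        return sorted(self.caches[cache_id]["records_of"].get(object_id, ()))

    @staticmethod
    def _drop(c, record_id, object_id):  # hypothetical helper
        objs = c["objects_of"].get(record_id)
        if objs is not None:
            objs.discard(object_id)
            if not objs:
                del c["objects_of"][record_id]  # empty object list: drop the entry


m = CacheManager()
m.create_cache("c1")
m.cache_object("page1", "<html>1</html>", "c1")
m.cache_object("page2", "<html>2</html>", "c1")
m.add_dependency("page1", "c1", "Table1")
m.add_dependency("page1", "c1", "Table2")
m.add_dependency("page2", "c1", "Table2")
m.invalidate_record("c1", "Table1")  # removes page1 only
```

In this run, invalidating Table1 deletes page1 and also removes page1 from the object list for Table2, so page2 remains the only object depending on Table2.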

FIG. 5 depicts an example of the cache manager 1 logic. 
As depicted, in step 1010 the cache manager receives a 
command (FIG. 4) from an application program. In step 
1020, the cache manager reads the command (FIG. 4) and 
invokes different logic 1100 . . . 1600, described below,
based on the command. 




FIG. 6 depicts an example of the cache manager logic
1200 for a cache_object (object_id, object, cache_id) 410
command. As depicted, in step 1200, the cache manager 1
determines if the cache_id parameter specifies a valid cache
2. If not, the status variable to be returned to the application
program is set appropriately, in step 1245. If the cache_id
specifies a valid cache 2, the cache 2 is preferably locked, to
prevent multiple processes from accessing the cache concurrently.
That way, consistency is preserved. Those skilled
in the art will appreciate that other locking schemes could be
used to provide higher levels of concurrency. The present
invention is compatible with a wide variety of conventional
locking schemes in addition to the example used in the
preferred embodiment.

In step 1205, the cache manager 1 searches for the object
6 by examining the directory 3 (FIG. 2). If a previous copy
of the object 6 is located, the OIB 10 for the object 6 is
updated, the old version of the object 6 in object storage 4
is replaced by the new one, and the status variable is set
appropriately, in step 1215. If, in step 1205, a previous copy
of the object 6 is not found, a new OIB 10 for the object 6
is created, initialized, and stored in the directory 3, in step
1210. The cache manager 1 also stores the object 6 in the
object storage 4 and sets the status variable appropriately.

In step 1230, the cache is unlocked to allow other processes
to update it. In step 1240, the status variable indicating
the result of the command is returned to the application
program. Processing then returns to step 1010 (FIG. 5).

FIG. 7 depicts an example of logic for the lookup_object
(object_id, cache_id) 415 command. As depicted, in step
1600, the cache manager 1 determines if the cache_id
parameter specifies a valid cache 2. If not, in step 1640 a
status variable is set appropriately and returned, in step
1680, to the application program. If the cache_id specifies
a valid cache, in step 1610 the cache 2 is locked.

In step 1620, the cache manager 1 searches for an object
6 corresponding to the object_id parameter by examining
the directory 3 (FIG. 2). If the object 6 is not found: the
cache 2 is unlocked in step 1650; the status variable is set in
step 1670 and returned to the application program in step
1680. If in step 1620 the object 6 is found: the cache 2 is
unlocked in step 1630; and the object 6 is returned to the
application program in step 1660.

FIG. 8 depicts an example of logic for the delete_object
(object_id, cache_id) 420 command. As depicted, in step
1100 the cache manager 1 determines if the cache 2 corresponding
to the cache_id parameter is valid. If not valid, in
step 1103 a status variable is set appropriately, and in step
1150 the status variable is returned to the application program.

If in step 1100 the cache_id specifies a cache 2 which is
valid, that cache is locked in step 1105. In step 1107, the
cache manager 1 searches for an object 6 corresponding to
the object_id parameter by examining the directory 3 (FIG.
2). If the object 6 is not found: the cache is unlocked in step
1108; the status variable is set in step 1109; and in step 1150
the status variable is returned to the application program. If
in step 1107 the object 6 is found, in step 1110 the cache
manager 1 deletes the object's associated record list 11 (FIG.
3) and updates the corresponding object lists 8 (FIG. 2).
The cache manager 1 scans each record ID 12 of the record
list 11 (FIG. 3) corresponding to the object 6. Note that each
record ID 12 on the record list 11 has a corresponding object
list 8 (FIG. 2). Pointers to object id(s) 9 (FIG. 2) corresponding
to the object 6 being deleted are removed from all
such object lists 8. If this results in any object list 8
becoming empty, the corresponding hash table entry 25 is
also deleted. After each element of the record list 11 is
examined, it can be deleted.

In step 1120, the object 6 is deleted from the object
storage 4. In step 1130, the corresponding OIB 10 is deleted.
Note that step 1120 can be performed concurrently with or
before steps 1110 and 1130. In step 1140, the cache is
unlocked and in step 1150, a status variable is returned to the
application program.

FIG. 9 depicts an example of logic for the add_dependency
(object_id, cache_id, record_id) 430 command.
As depicted, in step 1300, the cache manager determines
if the cache_id parameter specifies a cache 2 which
is valid. If not, a status variable is appropriately set, in step
1302 and returned to the application program, in step 1360.

If in step 1300, it is determined that the cache_id specifies
a valid cache, the cache 2 is locked, in step 1305. In step
1310, the cache manager 1 searches for the object 6 corresponding
to the object_id by examining the directory 3
(FIG. 2). If in step 1310, the object 6 is not found: the cache
2 is unlocked, in step 1315; the status variable is set in step
1317; and an appropriate status variable is returned to the
application program, in step 1360. If in step 1310, the object
6 is found: the cache manager 1 examines the record list 11
(FIG. 3) in step 1320 to see if an association (i.e. the
dependency information) between the object 6 and a record
identified by the record_id already exists. Alternatively, it
can be determined if the record corresponding to the
record_id has a hash table entry 25 and if so, to search for
the object_id 9 on the object list 8. If in step 1320, a
dependency to the object exists, the cache 2 is unlocked in
step 1325; the status variable is set appropriately in step
1327; and an appropriate status variable is returned to the
application program, in step 1360. If in step 1320, no
dependency to the object is found, in step 1330 an object_id
9 is added to the object list 8 for the record. A new hash table
entry 25 and object list 8 are created for the record if needed.
In step 1340, a record_id 12 is added to the record list 11
(FIG. 3) for the object 6. Note that step 1340 can be executed
concurrently with or before step 1330. The cache 2 is
unlocked, in step 1350 and the status variable is returned to
the application program, in step 1360.

FIG. 10 depicts an example of logic for the delete_dependency
(object_id, cache_id, record_id) 440 command.
As depicted, in step 1400, the cache manager 1
determines if the cache_id parameter specifies a cache 2
which is valid. If not, in step 1402 a status variable is set
appropriately and returned to the application program, in
step 1460.

In step 1400, if it is determined that the cache_id specifies
a valid cache, in step 1405 the cache is locked. In step
1410, the cache manager 1 searches for the object 6 corresponding
to the object_id by examining the directory 3
(FIG. 2). If in step 1410 the object 6 is not found: the cache
2 is unlocked, in step 1412; the status variable is set in step
1415 and returned to the application program, in step 1460.
If in step 1410, the object 6 is found: the cache manager 1
examines the record list 11 (FIG. 3), in step 1420 to see if
an association (i.e. the dependency information) between the
object 6 and a record identified by the record_id already
exists. Alternatively, it can be determined if the record
corresponding to the record_id has a hash table entry 25 and
if so, to search for object_id 9 on the object list 8. If in step
1420, no dependency is found, in step 1422 the cache 2 is
unlocked; the status variable is set appropriately in step
1425; and an appropriate status variable is returned to the
application program, in step 1460. If in step 1420, a dependency
to the object is found, in step 1430 the object_id 9 is
deleted from the object list 8 for the record. If this makes the
object list empty, the hash table entry 25 for the record is
deleted. In step 1440, the record_id 12 is removed from the
record list 11 (FIG. 3) for the object 6. Note that step 1440
can be executed concurrently with or before step 1430. The
cache is unlocked, in step 1450 and the status variable is
returned to the application program in step 1460.
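
The lock, search, update, unlock, return-status pattern shared by these command handlers can be sketched for the delete_object flow of FIG. 8. This is a minimal sketch under stated assumptions: the status strings, the LockedCache class, and the use of one mutex per cache are illustrative choices, and the directory here holds the record list directly in place of a full OIB.

```python
import threading

# Hypothetical status values; the text only says a status variable is set.
OK, BAD_CACHE, NOT_FOUND = "ok", "invalid_cache", "object_not_found"


class LockedCache:
    def __init__(self):
        self.lock = threading.Lock()
        self.directory = {}    # object_id -> record list (stands in for the OIB)
        self.storage = {}      # object_id -> object
        self.hash_table = {}   # record_id -> object list


def delete_object(caches, object_id, cache_id):
    cache = caches.get(cache_id)
    if cache is None:                                 # steps 1100, 1103
        return BAD_CACHE                              # step 1150
    with cache.lock:                                  # step 1105; released on exit
        if object_id not in cache.directory:          # step 1107
            return NOT_FOUND                          # steps 1108, 1109, 1150
        for record_id in cache.directory[object_id]:  # step 1110
            object_list = cache.hash_table[record_id]
            object_list.remove(object_id)
            if not object_list:                       # empty list: drop the entry
                del cache.hash_table[record_id]
        del cache.storage[object_id]                  # step 1120
        del cache.directory[object_id]                # step 1130
    return OK                                         # steps 1140, 1150


c = LockedCache()
c.directory = {"page1": ["r1", "r2"], "page2": ["r2"]}
c.storage = {"page1": "...", "page2": "..."}
c.hash_table = {"r1": ["page1"], "r2": ["page1", "page2"]}
caches = {"c1": c}
status = delete_object(caches, "page1", "c1")
```

A single lock per cache matches the simple scheme described for the preferred embodiment; finer-grained locking would be one of the other conventional schemes the text mentions.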

FIG. 11 depicts an example of logic for the invalidate_record
(cache_id, record_id) 450 command. As depicted, in
step 1500, the cache manager 1 determines if the cache_id
parameter specifies a cache 2 which is valid. If the cache is
not valid, a status variable is set appropriately in step 1502,
and returned, in step 1550 to the application program.

If in step 1500, the cache manager 1 determines the
cache_id parameter specifies a cache 2 which is valid, the
cache 2 is locked, in step 1505. In step 1510, the cache
manager determines if the values of any objects 6 are
dependent on a record associated with the record_id by
seeing if the record has a hash table entry 25 (FIG. 2). If no
hash table entry 25 is found for the record, the cache is
unlocked in step 1515 and the status variable is set in step
1517.

If in step 1510, a hash table entry 25 is found for the
record, the cache manager 1 scans the object list 8 for the
record, in step 1520. Each object 6 having an object ID 9 on
the object list 8 is deleted from the cache. As each object 6
is deleted, all references to the object 6 from other object
lists 8 are also deleted. Such references can be found by
traversing the record list 11 (FIG. 3) for the object 6 being
deleted. If the deletion of any such reference results in an
empty object list, the corresponding hash table entry is
deleted. After each element of the object list 8 associated
with the record_id 12 (corresponding to the record_id
parameter) is examined, the element can be deleted. In step
1530, the hash table entry 25 for the record is deleted. The
cache is unlocked in step 1540 and the status variable is
returned to the application program, in step 1550.

A straightforward extension of the invalidate_record
function which could be implemented by one skilled in the
art would be to update one or more objects which depend on
the record_id parameter instead of invalidating them.

Step 1099 represents other commands which the cache
manager might process. Those skilled in the art will appreciate
that there are numerous extensions and variations
within the scope and spirit of the present invention. For
example, one variation is to allow the cache manager 1 to
preserve and update the OIB 10 (FIG. 2) for an object 6 both
before the object 6 is ever cached and after the object 6 has
been removed from the cache. Using this approach, it would
not be necessary to delete the record list 11 for an object 6
and remove the object 6 from all object lists 8 when the
object 6 is removed from the cache. That way, dependency
information could be preserved and even updated while the
object 6 is not in the cache.

Another variation would be to allow the cache manager 1
to maintain and update a hash table entry 25 for a record both
before any objects are added to the object list 8 and after the
object list 8 becomes empty. In other words, before the cache
manager is aware of any dependency on the record and after
all dependencies on the record which the cache manager is
aware of become obsolete. This would be particularly valuable
if hash table entries 25 include other information in
addition to record IDs 12 and object lists 8.

Alternative Embodiment

FIG. 12a depicts another example of a system having
features of the present invention. In this as well as the
previous embodiment, the present invention can be used for
improving the performance of server applications in a conventional
client-server environment. One skilled in the art
could easily adapt the present invention for other applications
which are not client-server based as well. As depicted,
the system has a client-server architecture wherein a client 90
communicates with a server 100 over a network 95. A server 100
maintains one or more caches 2'. As is conventional, the
server 100 uses the caches 2' to improve performance and
lessen the CPU time for satisfying client 90 requests.
Although FIG. 12a shows the caches 2' associated with a
single server, the caches 2' could be maintained across
multiple servers as well.

An application running on the server 100 creates objects
and then stores those objects on one or more caches 2'. The
system can also be architected such that the source of the
underlying data in the database 99 and the cache 2' are
geographically separated. In this embodiment, an object is
data which can be stored in one or more caches 2'. The
objects can be constructed from underlying data stored on a
database 99. Underlying data include all data in the system
which may affect the value of one or more objects. Underlying
data are a superset of all objects in the system.

According to the present invention, the object manager
120 is preferably embodied as computer executable code
("program") tangibly embodied in a computer readable
medium for execution on a computer such as the server 100
(or client 90). The object manager 120 helps determine how
changes to underlying data affect the values of objects in the
caches 2'. Although FIG. 12a shows the object manager
residing on a single server, it could be distributed across
multiple servers as well. The object manager 120 is preferably
a long running process managing storage for one or
more caches 2'. The term cache is very generic and can
include any application (e.g., a client 90 application) in
addition to caches in the conventional sense. One skilled in
the art could easily adapt the present invention for an object
manager which is one of the following:

1. Multiple distinct processes, none of which overlap in
time; and

2. Multiple distinct processes, some of which may overlap
in time. This includes multiple concurrent object managers
so designed to improve the throughput of the system.

FIG. 12b depicts an example of an object dependence
graph 121 having features of the present invention. The
object dependence graph 121 (abbreviated by G) represents
the data dependencies between graph objects gobj1 . . .
gobjn. Here, gobj1, . . . , gobj7 represent different graph
objects and the edges e in the graph represent data dependencies.
For example, the edge from gobj1 to gobj5 indicates
that if gobj1 has changed, then gobj5 has also changed.
The weight w of the edge is an indication of how much a
change to an object, which is the source of an edge, affects
the object which is the target of the edge. For example, a
change to gobj1 would imply a more significant change in
gobj5 than a change in gobj2. This is because the weight w
of the edge e from gobj1 to gobj5 is 5 times the weight w of
the edge e from gobj2 to gobj5.

The object manager 120 is responsible for maintaining the
underlying data structures which represent object dependence
graphs (see FIGS. 12a-c and 16). Application programs
communicate the structure of object dependence
graphs to the object manager via a set of APIs (see FIG.
18a). The application also uses APIs to notify the object
manager of underlying data which have changed. When the
object manager 120 is notified of changes to underlying data,
it must determine which other objects have changed and
notify the caches 2' of the changes. It determines which other
objects have changed by following edges in the object
dependence graph (see FIG. 21).

For example, suppose that the object manager 120 is told 
that gobjl has changed. By following edges in the object 
dependence graph 121 from gobjl, it determines that both 
gobj5 and gobj7 have also changed. As another example, 
suppose that the object manager is told that gobj7 has 
changed. Since there are no edges in the object dependence 
graph for which gobj7 is the source, the object manager
concludes that no other objects are affected. 
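
Following edges from a changed graph object can be sketched as a breadth-first traversal. The graph below is hypothetical: the text states the edges gobj1 to gobj5 and gobj2 to gobj5 (the first weighted 5 times the second) and that gobj7 is reached from gobj1, but the gobj5 to gobj7 edge, the concrete weights, and the function name affected_objects are assumptions made for illustration.

```python
from collections import deque

# Hypothetical object dependence graph: source -> {target: weight}.
# The gobj5 -> gobj7 edge and the numeric weights are assumed.
graph = {
    "gobj1": {"gobj5": 5.0},
    "gobj2": {"gobj5": 1.0},
    "gobj5": {"gobj7": 1.0},
    "gobj7": {},
}


def affected_objects(graph, changed):
    """Return every graph object reachable from `changed` along edges."""
    seen = set()
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for target in graph.get(node, {}):
            if target not in seen and target != changed:
                seen.add(target)
                queue.append(target)
    return sorted(seen)


from_gobj1 = affected_objects(graph, "gobj1")
from_gobj7 = affected_objects(graph, "gobj7")
```

With these assumed edges, a change to gobj1 reaches gobj5 and gobj7, while a change to gobj7 reaches nothing because gobj7 is the source of no edge.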

FIG. 12c depicts an example of an object manager 120
having features of the present invention. As depicted, the
object manager 120 includes several storage areas:

1. The object dependence graph G 121 (see FIG. 12d)
which is implemented by multiple dependency information
blocks (DIBs) 128. Those skilled in the art will appreciate
that the DIBs can be stored using a variety of data structures.
Preferably, conventional hash tables are used wherein the
DIBs are indexed by object_ids 160. Hash tables are
described, for example, in "The Design and Analysis of
Computer Algorithms", Aho, Hopcroft, Ullman, Addison-Wesley,
1974.

2. The multiple record tree (MRT) 122 (see FIGS. 27-28).

3. The single record tree (SRT) 123 (see FIGS. 27-28).

4. Auxiliary state information 124 which includes but is
not limited to the following:

a. num_updates 125: a counter maintained by the object
manager for tracking the number of updates the object
manager has propagated through the graph.

b. consistency stack 128.5: Used for maintaining consistency
during updates.

c. relation info 129 (see FIGS. 27-28).

5. Program logic 126.

FIG. 13 depicts an example of the storage areas maintained
by each cache 127. Each cache has a cache_id 135
field which identifies it. There are 3 main storage areas:

1. Directory 130: Maintains information about objects.
The directory 130 includes multiple object information
blocks (OIBs) 10'. Information about an object may be
retained in an OIB 10' (FIG. 14) after the object leaves the
cache. Those skilled in the art will appreciate that the OIBs
can be stored using a variety of data structures. Preferably,
conventional hash tables are used wherein the OIBs are
indexed by object_id's 160.

2. Object storage 132: Where objects contained in the
cache are stored.

3. Auxiliary state information 124: Includes other state
information, e.g., the cache_id 135.

FIG. 14 depicts an example of an OIB 10'. The OIB
preferably includes the following:

object_id 160: assume for the purposes of the following
discussion that an object has an object_id o1;

version_num 141: allows the object manager to uniquely
identify different versions of the same object;

timestamp 142: a number which indicates how recently
the object was calculated;

actual_weight 143: the sum of the weights of all edges to
o1 from a graph object o2 such that the cached version of o1
is consistent with the current version of o2; and

dep_list 144: a list representing dependencies to the
object o1.

FIG. 15 depicts an example of a dep_list 144 element. As
depicted, each list element preferably includes:

object_id 160: represents a graph object o2 which has a
dependency edge to o1, i.e., o2 is the source and o1 the
target;

weight_act 152: a number representing how consistent
the most recent version of o2 is with the cached version of
o1. The preferred embodiment uses values of 0 (totally
inconsistent) or the weight 165 (FIG. 16) for the corresponding
edge in the dependency information block (DIB) 128
(see FIG. 16) (totally consistent). A straightforward extension
would allow values in between these two extremes to
represent degrees of inconsistency; and

version_num 153: the version_num of o2 which is
consistent with the cached version of o1.

FIG. 16 depicts an example of the dependency information
block (DIB) 128 of FIG. 12. As depicted, the DIB 128
preferably includes the following fields:

object_id 160: used by the application program to identify
the graph object. Assume for the purposes of the
following discussion that a graph object has an object_id
o1;

version_num 161: version number for the current version
of the graph object;

timestamp 162: timestamp for the current version of the
graph object;

storage_list 163 (for graph objects which are objects): list
of cache_id's for all caches containing the object;

incoming_dep 164: list of (object_id 160, weight 165)
pairs for all graph objects o2 with dependency edges to o1.
The weight 165 represents the importance of the dependency.
For example, higher numbers can represent more
important dependencies;

outgoing_dep 166: list of all object_id's for which there
exists a dependency edge originating from o1;

sum_weight 167: the sum of the weights of all dependency
edges going into o1;

threshold_weight 168 (for graph objects which are
objects): number representing when an object should be
considered highly obsolete. Whenever the actual_weight
143 field in an OIB 10' (FIG. 14) falls below the threshold_weight
168 field for the object, the object is considered to be
highly obsolete and should be invalidated or updated from
the cache;

consistency_list 169 (for graph objects which are
objects): a list of object_id's 160 corresponding to other
objects which must be kept consistent with the current
object. Preferably, consistency is only enforced among
objects within the same cache. A straightforward extension
would be to enforce consistency of objects across multiple
caches. Another straightforward extension would be one
which required all objects on the list 169 to be in/out of the
cache whenever the object_id is in/out of the cache;

latest_object 1601 (for graph objects which are objects):
a pointer to the latest version of the object, null if the object
manager is unaware of such a copy. This field allows an
object to be updated in multiple caches without recalculating
its value each time;

relational_string 1602: null if the graph object is not a
relational object. Otherwise, this is of the form: relation_name
(25, 30) for SRO's and relation_name (>-50) for
MRO's. The following are only of relevance if relational_string
1602 is not null;

multiple_records 1603: true if the graph object is a
multiple record object (MRO), false if it is a single record
object (SRO);

The following are only of relevance if multiple_records
1603 is true:

mro_dep_weight 1604: the weight assigned to an
implicit dependency from another relational object to o1;
and

mro_threshold_increment 1605: for each implicit dependency
to o1, the amount the threshold_weight should be
incremented.
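
The relationship between weight_act 152, actual_weight 143, and threshold_weight 168 can be illustrated with a small sketch. The class names, the numeric weights, and the threshold are assumptions, and actual_weight is computed on demand here rather than stored as a field, which is a simplification of the OIB described above.

```python
from dataclasses import dataclass, field


@dataclass
class DepListElement:
    """One dep_list 144 element (FIG. 15)."""
    object_id: str      # source graph object o2
    weight_act: float   # 0 (inconsistent) or the edge weight (consistent)
    version_num: int    # version of o2 consistent with the cached o1


@dataclass
class OIB:
    """Simplified OIB 10' holding only the fields used below."""
    object_id: str
    dep_list: list = field(default_factory=list)

    @property
    def actual_weight(self):
        # Assumption: derived from the dep_list instead of being stored.
        return sum(d.weight_act for d in self.dep_list)


def highly_obsolete(oib, threshold_weight):
    """True when actual_weight falls below the object's threshold_weight."""
    return oib.actual_weight < threshold_weight


# Hypothetical object o1 with incoming edges of weight 5 (from o2) and 1 (from o3).
o1 = OIB("o1", [DepListElement("o2", 5.0, 1), DepListElement("o3", 1.0, 1)])
threshold = 4.0
before = highly_obsolete(o1, threshold)   # actual_weight is 6.0
o1.dep_list[0].weight_act = 0.0           # o2 changed: cached o1 now inconsistent with it
after = highly_obsolete(o1, threshold)    # actual_weight is 1.0
```

Under these assumed numbers, a change to the lightly weighted o3 alone would leave o1 above the threshold, while a change to o2 pushes it below and marks it for invalidation or update.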
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Referring again to FIG. 12, the object manager preferably Those skilled in the art will appreciate that many addi- 

also maintains a counter num_updates 125 (initially zero) tional APIs can be implemented in a straightforward manner 

which tracks the number of updates the object manager has within the spirit and scope of the present invention. For 

propagated through the graph. The object manager also example, APIs can also be added to delete dependencies and 

maintains a data structure (initially empty) called the con- $ modify dependency weights 

sistency stack 128.5 (HG. 12c) which is used to preserve consistency_iist 169-corresponding to an object 

consistency among objects m caches -obj_id" -is set via an API call: 

T^e application program ^7 preferably commimicates define_consistencyJist (obj^d, list_oLobjects) 183. 

with the object manager via a set of API functions. HG, 17 consistency list for the obj_id is preferably not al owed 

depicts examples of several APIs m accordance with the „ , . , , ^ . fl .^^^ , . 

present invention. Those skilled in the art will appreciate ^ '"^^^^^^ ""^^^^ ^ "^^"^^^'^ 

that otherAPIs can be implemented that are straightforward irom occurring. 

extensions in view of the present invention. similarly be added within the spirit and scope of 

FIG. 18 depicts an example of the object manager 120 ^® present mvention to modify the consistency lists 169 

logic for handling different API functions. These functions creation. 

will be described in detail later By way of overview, nodes Changes to the dependency information block (DIB) 128 

in the object dependence graph G 121 can be created via the (F^^. 16) for an object, after an object has been cached may 

API call to the object manager create_node (obj_id, require updates to one or more caches 127. These are 

initial_version__oum, thresh_weight) 181. Dependencies straightforward. In the event of a new dependency to a 

between existing nodes in the graph can be aeated via the cached object ol from a new graph object o2, the new 

API call: add__dependency (source_„object_id, target_ 20 dependence is obsolete if the object manager doesn't know 

object_id, dep__weight) 182. The consistency_list 169— when o2 was created, or the DIB timestamp 162 for o2>0IB 

corresponding to an object "obj_id" — can be set via the API timestamp 142 for ol. Nodes can be deleted from G via the 

call: define„consLstency_list (obj_id, list_of_objects) API, deletejode (obj_id) 184. 

183. Nodes can be deleted from G via the API delete_jiod6 Objects can be explicitly added to caches via the APIs: 

^^^'H'll/^l-. f^^ cache_latest_version (obj_id, ^ cache Jatest_version (obj_id, cache) 185; and copy_ 

cache) 185 adds the atest version of an object to a cache. object (obj_id, to_cache_id, from_cache_id) 186, These 

The API copy object (obj_id, to cache_id, from cache_ ^^..j, ojg,^ ^5 ^ ^^^^^ ^ ^ 

id) 186 attempts to copy a version of an object from one ^^^.^ ^^^^^ ^^^^ 

cache to another cache. Objects are deleted from a cache via ^ir^ m i ■ * 1 r .u Anr 1. 1 * * 

the API call: delete.objeci (obj_id, cache) 187. ,^ jl^'^f' ^ ^"^^^^^ °/ T^T^f ^ 

An application program which changes the value of ^° version (obj id, cache 185. As depicted, m step 2030, it is 

underlying data must inform the object manager. TVo API ^^^^f ^ ^^^^ ^^'J-^ '^^^^^ parameters specify exist- 

calls for achieving this are: object_has_changed (obj_Jd) objects and caches, respectively. If so, processing pro- 

188 where the obj„id parameter identifies a graph object; ^^^^ ^ step 2040. If not, an appropriate status message is 

and objects_have__changed (hst_of_objects) 189 where returned and processing proceeds to step 2010. In step 2040, 

the list_o£_obj6cts parameter includes a list of pointers to) ^5 it is determined if the latest version of an obj_id is in the 

graph objects. cache. If so, processing continues with step 2010. If not, in 

A node corresponding to an SRO is created via the API step 2050 an attempt is made to obtain the latest version of 

call create__sro_node (obj_id, initial_version_num, obj_id from the latest_object field 1601 in the dependency 

thresh_weight, relation_name, list„of_attribute_values) information block (DIB) 128 (FIG. 16). If this field is null, 

190. 40 in step 2050, the latest value of obj_id (and possibly makes 

MRQ's are created via the API: create_mro_node (obj_ its value accessible through the latest_object field 1601 of 

id, inital_version_num, thresh_weight, relation_name, the DIB) is calculated, and the version_num field 161 in the 

list_of_attribute_conditions, reL.de fault„weight, rel_ dependency information block (DIB) 128 (FIG. 16) is 

default_lhreshold) 191. updated. In step 2050, either the new version of obj_id is 

The API compare_objects (obj^d, cache_Jdl, cache_ 45 recalculated entirely, or just portions of it, and the new parts 

id2) 192 can be used to determine how similar the versions merged with parts from existing versions. The latter method 

of obj_id in cache_Jdl and cache_id2 are. The API is often more efficient than the former. 

update_cache (cache) 193 ensures that all items in the cache An OIB 10' for obj_Jd is created in the directory 130 for 

are current The API defin6_rclation (rclation_name, list„ the cache, if one doesn't already exist. If the cache previ- 

of_attributes) 194 identifies relations to the object manager. 50 ously contained no version of the obj^d, the cache is added 

When one or more records change, the object manager can to the storage_list 163 of obj_id. The version__Dum 141 

be informed of this via the APIs record_has_changed and timestamp 142 fields of the OIB 10* (FIG. 14) are set to 

(relation_name, list_of_attribute_values) 195 and the version_jium 161 and timestamp 162 fields of the 

records„have_changed (relarion_name, list_of_ dependency information block (DIB) 128 (FIG, 16). The 

attribute_conditions) 196. ss actual_weight field 143 of the OIB 10* (FIG, 14) is set to the 

Nodes in the object dependence graph G 121 are created sum_weight field 167 of the DIB. For each (o2, weight_act, 
via the API call to the objea manager createjode (obj^d, version_num) triplet belonging to the dep Jist 144 of the 
initiaL-version_num, thresh_weight) 181. Those skilled in OIB 10' (FIG. 14), the W6ight„act 152 is set to the weight 
the art will appreciate that many additional APIs can be 165 for the corresponding edge on the incoming_dep 164 of 
implemented in a straightforward manner within the spirit 60 the DIB. The version^num 153 is set to the vcrsion_num 
and scope of the present invention. For example, APIs can 161 field contained in the DIB for o2. In step 2060, it is 
be added for modifying the objected 160, version_num insured that consistency is preserved. This function recur- 
161, and threshold_weight 168 fields after a node has been sively insures that all noncurrent objects obj2 on the con- 
created, sistency fist 169 for obj_id are updated or invalidated 

Dependencies between existing nodes in the graph are 65 whenever the timestamp 142 in the OIB 10* for obj2 is 

created via an API call: add_dependency (source_object_ before the timestamp 162 in the DIB 128 for obj_Jd. If any 

id, target_object_Jd, dep_weight) 182. such objects obj2 are updated in this process, a similar 
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procedure is applied recursively to the consistency lists 169 
for each said obj2. The ordering of Steps 2050 and 2060 is 
not critical to the correctness of this embodiment. 
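
The FIG. 19 flow (steps 2030 through 2050) might be sketched as follows. This is only an illustrative sketch, not the patented implementation: the `DIB` and `OIB` classes, the dictionary layout of a cache (`"directory"` and `"storage"` keys), the status strings, and the `recompute` callback are all assumptions introduced here, and the consistency step 2060 is deliberately left out.

```python
from dataclasses import dataclass, field


@dataclass
class DIB:
    """Hypothetical dependency information block for one object."""
    version_num: int = 1
    timestamp: int = 0
    latest_object: object = None          # plays the role of field 1601
    storage_list: set = field(default_factory=set)


@dataclass
class OIB:
    """Hypothetical object information block kept in a cache directory."""
    version_num: int
    timestamp: int


def cache_latest_version(dibs, caches, obj_id, cache_id, recompute):
    # Step 2030: both parameters must name an existing object and cache.
    if obj_id not in dibs or cache_id not in caches:
        return "invalid parameter"
    dib, cache = dibs[obj_id], caches[cache_id]
    oib = cache["directory"].get(obj_id)
    # Step 2040: nothing to do if the cache already holds the latest version.
    if oib is not None and oib.version_num == dib.version_num:
        return "already current"
    # Step 2050: reuse latest_object when present, otherwise recompute it.
    if dib.latest_object is None:
        dib.latest_object = recompute(obj_id)
    cache["storage"][obj_id] = dib.latest_object
    # Create or refresh the OIB and record the cache on the storage list.
    cache["directory"][obj_id] = OIB(dib.version_num, dib.timestamp)
    dib.storage_list.add(cache_id)
    return "cached"
```

A caller would pass a `recompute` function that rebuilds the object from underlying data; a fuller version would also refresh the actual_weight and dep_list fields and then run the consistency pass of step 2060.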

FIG. 20 depicts an example of the API, copy_object (obj_id, to_cache_id, from_cache_id) 186. As depicted, in step 2100 it is verified that the obj_id, to_cache_id, and from_cache_id parameters are all recognized by the object manager. If so, in step 2110 it is determined if from_cache_id has a copy of obj_id. If not, nothing happens and processing proceeds to step 2010. A status variable is set appropriately for this (and other cases as well) and is returned to the application program to indicate what happened. Otherwise, processing continues to step 2120, in which it is determined if to_cache_id and from_cache_id include identical versions of obj_id. If so, no copying needs to take place, and processing continues to step 2010. Otherwise, step 2130 determines if from_cache_id contains the latest version of the obj_id. If so, in step 2140, the object is copied to the object storage 132 area of to_cache_id and the cache directory 130 is updated. An OIB 10' for obj_id is created in the directory 130 for to_cache_id if one doesn't already exist. If to_cache_id previously contained no version of obj_id, to_cache_id is added to the storage_list 163 of obj_id. In step 2170, consistency is preserved by ensuring that all noncurrent objects on consistency lists 169 with OIB timestamps 142 prior to the DIB timestamp 162 of obj_id are either updated or invalidated. Otherwise, if the result of step 2130 is negative, in step 2150 the object will not be allowed to be copied unless: (1) all objects on the consistency list 169 for obj_id for which noncurrent versions are stored in to_cache_id have the same timestamp 142 as the timestamp 142 for the version of obj_id in from_cache_id; and (2) all objects on the consistency list 169 for obj_id for which current versions are stored in to_cache_id have the same or earlier timestamp 142 as the timestamp 142 for the version of obj_id in from_cache_id. If these conditions are satisfied, in step 2160 obj_id is copied from from_cache_id to to_cache_id.
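
The two conditions of step 2150 can be expressed as a single predicate. The sketch below is a hedged illustration only; the function name `may_copy_stale_version` and the dictionary shapes used for the OIB and DIB data are hypothetical and do not come from the description.

```python
def may_copy_stale_version(src_timestamp, consistency_list, to_cache_oibs, dib_versions):
    """Step 2150 test for copying a version that is not the latest.

    src_timestamp    -- timestamp 142 of obj_id's version in from_cache_id
    consistency_list -- ids of the objects on consistency list 169 for obj_id
    to_cache_oibs    -- id -> (version_num 141, timestamp 142) held in to_cache_id
    dib_versions     -- id -> current version_num 161 in the DIB
    """
    for other in consistency_list:
        if other not in to_cache_oibs:
            continue  # not stored in the target cache, so it imposes no constraint
        version_num, timestamp = to_cache_oibs[other]
        if version_num == dib_versions[other]:
            # Condition (2): a current version must have the same or an earlier timestamp.
            if timestamp > src_timestamp:
                return False
        else:
            # Condition (1): a noncurrent version must have exactly the same timestamp.
            if timestamp != src_timestamp:
                return False
    return True
```

Only when this predicate holds would the copy of step 2160 proceed; otherwise the status variable would report that the copy was refused.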

An OIB 10' for the obj_id is created in the directory 130 for to_cache_id if one doesn't already exist. If to_cache_id previously contained no version of obj_id, to_cache_id is added to the storage_list 163 of obj_id.

A straightforward extension to the copy_object and cache_latest_version APIs would be flags which could prevent an object from being stored if other objects on the consistency list would also need to be updated. Another straightforward extension would be additional flags which would only place the object_id in a cache if the cache did not include any version of the object_id.

Another straightforward extension would be a system where the object manager maintained all previous versions of an object. We could then have APIs for adding a specific object identified by a particular (object_id, version_num) pair to a cache.

Objects are deleted from a cache via the API call: delete_object (obj_id, cache) 187. One skilled in the art will appreciate that it is straightforward to implement this function in accordance with this detailed description. An example of a function performed by this API call is the removal of cache from the storage_list field 163 of the dependency information block (DIB) 128 (FIG. 16) for the object identified by obj_id.
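
One possible shape for delete_object is sketched below. The description names only the storage_list bookkeeping; removing the stored value and its directory entry, the dictionary layout, and the boolean return value are assumptions made for this example.

```python
def delete_object(dibs, caches, obj_id, cache_id):
    """Remove obj_id from one cache and keep its storage_list in sync.

    Returns True if the cache actually held the object (an assumed convention).
    """
    if obj_id not in dibs or cache_id not in caches:
        return False
    cache = caches[cache_id]
    removed = obj_id in cache["storage"]
    cache["storage"].pop(obj_id, None)      # drop the cached value
    cache["directory"].pop(obj_id, None)    # drop the OIB for the object
    # The bookkeeping named in the description: the cache leaves storage_list 163.
    dibs[obj_id]["storage_list"].discard(cache_id)
    return removed
```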

An application program which changes the value of underlying data must inform the object manager. Two API calls for achieving this are: object_has_changed (obj_id) 188 where the obj_id parameter identifies a graph object; and objects_have_changed (list_of_objects) 189 where the list_of_objects parameter includes a list of (pointers to) graph objects.

If the graph objects on list_of_objects affect many other graph objects in common, the objects_have_changed API will be more efficient than invoking the object_has_changed API once for each graph object on the list.

FIG. 21 depicts an example of the API, objects_have_changed (list_of_objects) 189. Those skilled in the art will appreciate that it is straightforward to then implement the API, object_has_changed (obj_id).

For ease of exposition, we assume that each element of list_of_objects corresponds to a valid node in G and that no two elements on the list_of_objects refer to the same node. It would be straightforward to adapt this function from the detailed description for situations where this is not the case. As depicted, in step 2400 the counter num_updates 125 (FIG. 12c) is incremented by 1. In step 2402, it is determined if all nodes corresponding to the graph objects specified by the list_of_objects parameter have been visited. If so, in step 2403, the update propagation phase (see FIG. 22) is followed, in step 2404, by the consistency check phase (see FIG. 26). If not, in step 2405, a new node corresponding to a graph object on the list_of_objects is visited. Let obj_id be the object_id 160 for the node. The object manager increments the version_num field 161 in the dependency information block (DIB) 128 (FIG. 16) for obj_id by 1 and sets the timestamp field 162 to the value of num_updates 125. Steps 2406 and 2408 represent a loop which notifies each cache c1 containing obj_id (obtained from storage_list 163) to update or invalidate its version of obj_id. In step 2406, a function update_or_invalidate (c1, obj_id) (see FIG. 25) is invoked to cause this to happen.
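
The first loop of FIG. 21 (steps 2400 through 2408) can be sketched as below. This is an illustration under stated assumptions: the `manager` dictionary, its key names, and the `update_or_invalidate` callback signature are hypothetical, and the update propagation and consistency check phases are not shown.

```python
def mark_changed(manager, list_of_objects, update_or_invalidate):
    """Stamp each changed node and notify every cache that holds it."""
    manager["num_updates"] += 1                        # step 2400
    for obj_id in list_of_objects:                     # steps 2402 and 2405
        dib = manager["dibs"][obj_id]
        dib["version_num"] += 1
        # Setting the timestamp to num_updates also marks the node as visited.
        dib["timestamp"] = manager["num_updates"]
        for cache_id in sorted(dib["storage_list"]):   # steps 2406 and 2408
            update_or_invalidate(cache_id, obj_id)
    # The update propagation phase (FIG. 22) and the consistency check
    # phase (FIG. 26) would follow here.
```

Because every node touched by one call receives the same timestamp, a later test of the form "timestamp equals num_updates" is enough to tell whether a node has already been visited during that call.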

Those skilled in the art will appreciate that it is straightforward to apply selectivity in step 2406 in deciding which caches must update or invalidate their copies of obj_id.

FIG. 25 depicts an example of the update_or_invalidate (cacheid, objectid) logic. It is called whenever the version of objectid currently in cacheid must either be updated or invalidated (see e.g., step 2406, FIG. 21). As depicted, in step 2407 it is determined whether the objectid should be updated in the cacheid. If the answer is no, the objectid is invalidated from the cache in step 2440 and the procedure returns, in step 2441. If the answer is yes, in step 2442 the following changes are made to the OIB 10' (FIG. 14) for objectid:

1. The version_num 141 and timestamp 142 fields are set to the current version_num 161 and timestamp 162 fields contained in the dependency information block (DIB) 128 (FIG. 16).

2. The actual_weight field 143 is set to the sum_weight field 167 in the DIB.

3. The dep_list 144 (FIG. 15) is updated. Each member of the list 144 corresponds to a graph object o2 which has a dependency to the object identified by objectid. The weight_act 152 is set to the weight 165 field in the dependency information block (DIB) 128 (FIG. 16) corresponding to the same edge in G if these two quantities differ. In addition, version_num 153 is set to the version_num field 161 contained in the DIB for o2 if these two quantities differ.
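
The three OIB changes of step 2442 can be sketched as follows. The dictionary layout is hypothetical, and the sketch rebuilds the dep_list wholesale rather than patching only the entries that differ; under the assumption that the dep_list has one entry per incoming edge, the end state is the same.

```python
def refresh_oib(oib, dibs, objectid):
    """Step 2442: bring a cache's OIB for objectid in line with the DIBs."""
    dib = dibs[objectid]
    # Item 1: copy the current version number and timestamp.
    oib["version_num"] = dib["version_num"]
    oib["timestamp"] = dib["timestamp"]
    # Item 2: every incoming dependency is current again.
    oib["actual_weight"] = dib["sum_weight"]
    # Item 3: one dep_list entry per incoming edge o2 -> objectid.
    oib["dep_list"] = {
        o2: {"weight_act": weight, "version_num": dibs[o2]["version_num"]}
        for o2, weight in dib["incoming_dep"].items()
    }
    return oib
```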

In step 2444, the actual value of objecdd contained in the 
object storage area 132 is updated. First, an attempt is made 
to obtain the updated version of objectid from the latest_ 
object field 1601 in the dependency information block (DIB) 
128 (FIG. 16). If this succeeds, step 2444 is over. If this fails 
(i.e., this pomter is nil), the updated version of objectid is 
calculated, e.g., by either calculating the new version of 
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objectid entirely or just recalculating portions of it and a depth first manner, in response to changes to underlying 

merging the new parts with parts from existing versions. The data (dfs). Suppose an edge from a first node objl to a 

latter method is often more efficient than the former. In either second node obj2 has just been traversed. In step 2416, it is 

case, the object manager then has the option of updating the determined if the node obj2 has been visited yet. The answer 

latest__object field 1601 in the DIB so that other caches 5 is yes if and only if the timestamp 162 (FIG. 16) for 

which might need the latest version of the objected can obj2=num_updates 125 (FIG. 12). 

simply copy it instead of recalculating it. If the result from step 2416 is true, processing continues 

In some cases, in step 2444 the actual value of the objectid at step 2417. This step is part of a loop where all caches on 

can be updated with a later version of the objectid, prefer- storage_list 163 (FIG. 16) are examined to see if they 

ably the latest easily accessible one (which would generally lO include a copy of obj2. Recall that each object preferably has 

be the cached version with the highest_version_num 141) an object_Jd field and a version_number field. The object_ 

which is not actually current. This is advantageous if cal- id field corresponds to something which an application 

culating the current value of objectid is prohibitively expen- program would use to identify the object (e.g., the URL), 

sive. Preferably, this type of update would not be allowed if while the version number field allows multiple objects with 

either ofthe following are true: is the same object_Jd to be maintained and uniquely identi- 

1. the objectid is one of the objects on the list passed to fied. For each such cache cacheid, in step 2420 it is deter- 
objects_have_changed (list_of_objects); or mined if the version of obj2 is current by comparing the 

2. For the later version of objectid, it is still the case that version_num field 141 in the OIB 10' (FIG. 14) with the 
actuaL-weight 143 <threshold__weight 168. V6rsion_num field 161 in the dependency information block 

In step 2443, (object_id 160, cacheid) pairs are added to 20 (DIB) 128 (FIG. 16). If the result from step 2420 is 

the consistency stack 128.5 (FIG. 12) for each object on the affirmative, in step 2421 it is ensured that on the dep_list 

consistency_list 169 which is in the cache__identified by 144 for obj2, the element corresponding to objl has a 

cacheid. The object manager 120 makes sure that all cached version_num 153=version_num 161 in the DIB for objl. 

items on the consistency stack 128.5 are consistent in the If the result from step 2420 is negative, i.e., the version of 

consistency check phase (FIG. 26). 25 obj2 is not current, a function decrease_weight (cacheid. 

The consistency stack could be implemented in several objl, obj2) is invoked (See FIG. 24). Recall that each edge 

fashions; two possible structures are lists and balanced trees can have a non negative number associated with it known as 

(Reference: Aho, Hopcrokt, UUman). Lists have the advan- the weight which represents the importance of the data 

tage that insertion is constant time. The disadvantage is that dependence. For example, high numbers can represent 

duplicate copies of items could end up on them. Trees have 30 important dependencies, while low numbers represent insig- 

the advantage that no duplicate items need be stored. The nificant dependencies. Recall also that objects can also have 

disadvantage is that insertion is 0(log(n)), where n is the a value known as the thr6shold_weight associated with 

number of items on the consistency stack. them. Whenever the sum of the weights corresponding to 

Step 2443 may optionally apply more selectivity before incoming data dependencies which are current falls below 

adding an object to the consistency stack. Let object_id2 be 35 the threshold_weight, the object is considered to be highly 

an object on the consistency list 169 which is in cacheid. If obsolete. Such objects should be updated or invalidated for 

cacheid contains a current version of object_id2, (object appfications requiring recent versions of objects. 

id2, cacheid) doesn't have to be added to the consistency If the result of step 2416 is false, in step 2423 the 

stack. The version is current if both ofthe following are true: version_num field 161 for obj2 is incremented and the 

1. The vertex corresponding to object_id2 has akeady 40 timestamp field 162 is set to num_updates 125 (FIG. 12) 
been visited in processing the current call to objects_have_ which indicates that obj2 has been visited. Step 2424 is part 
changed (Iist_of_objects) 189. This is true if and only if the of a loop where all caches which on the storage_list 163 are 
timestamp field 162 in the dependency infonnation block examinedtoseeif they include a copy of obj2. For each such 
(DIB) 128 (FIG. 16) for object^j id2 is equal to num_ cache cacheid, in step 2425 the decrease_weight (cacheid, 
updates 125; and 45 objl, obj2) functign is invoked After this loop exits, in step 

2. The versioo_num field 141 in the OIB lO" (FIG. 14) 2426 the dfs logic (FIG. 23) is recursively invoked on all 
and 161 in the DIB for objcct_id2 are the same. outgoing edges from obj2. 

If step 2443 determines that both (1) and (2) are true, FIG. 24 depicts an example of the decrease weight 

(object_id2, cacheid) is not added to the consistency stack. (cacheid,from_obj, to_obj) logic. As depicted, in step 2425 

If (1) is true but (2) is false, step 2443 could recursively so the actuaL_weight field 143 for to_obj is decremented by w 

invoke update_or_Jnvalidate on object_id2 and cacheid where w is the weight^act field 152 corresponding to the 

which would obviate the need for adding (object_id2, edge from from_obj to to_obj. In step 2435, it is deter- 

cache_id) to the dependency stack. mined if the actual_weight 143 <threshold_weight 168; if 

One skilled in the art could easily implement Steps the answer is yes, the function update_or_invalidate cac- 

2442,2443, and 2444 in any order or in parallel from the 55 heid (cacheid, to_obj) is invoke If the answer is no, in step 

description. 2436 the weight_act field 152 is set corresponding to the 

FIG. 22 depicts an example of the update propagation edge from from_obj to to_obj to 0. 

phase for objects_Jiave_changed (list_of_objccts) 189. After the update propagation phase, the object manager 

The basic function performed by Steps 2403 and 2416 is to must ensure that the consistency_Jists 169 are in fact 

traverse all edges of the graph G accessible from the 60 consistent. This is done in the consistency check phase 

list_of_objects. The preferred technique is analogous to a depicted in FIG. 26. As depicted, step 2404 is part of a loop 

depth-first search ("dfs") (reference: Aho, Hopcroft, which examines each (object_id 160, cache^id 135) pair in 

Ullman). However, one skilled in the art could easily adapt the consistency stack 128.5 (FIG. 12c). For each such pair, 

the technique to work with other graph traversal methods in step 2451 it is determined if the version of object_Jd in 

such as a breadth-first search. 65 the cache cache_id is current by comparing the version_ 

FIG. 23 depicts an example of a part of a method for num field 141 with the version_num field 161. If the answer 

propagating changes through the object dependence graph in is yes, processing returns to step 2404., Otherwise, the 
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object must either be updated or invalidated. In step 2455 it initial_version_num, thresh_weight, relation_naine, list_ 

is determined whether the object should be updated. If the of_attribute_values) 190 (FIG. ISa). Referring now to FIG. 

answer is no, the object is invalidated in step 2440 described 27, in step 2300 it is detennined if all input parameters are 

earher (see FIG. 25). If the answer is yes, the latest value is valid (e.g., they arc of the right type, etc). It is also verified 

added to the cache in step 2050 and the new consistency 5 that the relation "relalion_Dame'* was previously defined via 

constraints are satisfied in step 2060 which were both a call to defin6_rclation 194 by examining the relation info 

described earlier (see FIG. 19). The ordering of steps 2050 area 129. It is also verified that the list_of_^^^„,^values 

and 2060 is not critical to the correctness of this example. contains the connect number of values and that aS values are 

Another API, update_cache (cache) 193, ensures that all of the correct type. It is also verified that a node for obj_id 

items in the cache are current. It docs so by examining the lO or a node corresponding to the same record doesn't already 

OIB's for every object in the cache and invalidating or exist (it would be easy to modify the design so that if a node 

updating obsolete items. It ignores consistency lists because for obj_id already existed, the old node would be overwrit- 

all objects will be current and therefore consistent after the ten. It would also be easy to modify the design so that 

function completes. multiple nodes with the same obj_id could exist. It would 

Relations 15 also be easy to allow multiple nodes to correspond to the 

The present invention has special features for handling same record. If it is determined that all parameters are valid, 

records (These records are not synonymous with records processing continues with step 2305. Otherwise, create_ 

used in the preferred embodiment) which may be part of a sro_node returns at step 2320 with an appropriate status 

relational database (see *TJndcnstanding the New. SQL: A message. 

Complete Guide" by J. Melton and A. R. Simon, Morgan 20 In step 2305 a new node is created in G by initializing the 

Kaufraann, 1993), objcct__id 160 to obj_id; version jum 161 to initial^ 

For example, suppose that a relation rel name has the version__num; threshold_weight 168 to thresh_weight; and 

attributes age and weight, both of type integer. For the relational_string 1602 to relation_name concatenated with 

following: rel_name (age=25, wei^t=34) represents a all of the attribute values. The relation and attribute values 

single record; while reLname (age =25) is a multirccord 25 comprising relationai_slring 1602 are preferably all sepa- 

specifier (MRS) and represents all records belonging to rated by delimiters. That way, it is easy to identify the 

rel_name for which age=25. relation and each attribute value easily from the relational_ 

The present invention has features allowing objects which string 1602. A multiple__records 1603 field (FIG. 16) is set 

correspond to either single or multiple lY- records to be to false. In step 2310, a pointer to the node is added to the 

managed. Such objects are known as relational object& A 30 SRT. The position of the new pointer in the SRT is deter- 

single object can represent multiple records from the same mined from relational_string 1602. In step 2315 dependen- 

relation. Such an object is known as a multiple record object cies are added from the obj_id to each multiple record 

(MRO) while an object corresponding to a single record is object (MRO) containing it Such MRO's are found by 

known as a single record object (SRO). An MRO objl examining the multiple record tree MRT 122. The MRT is 

contains (includes) another relational object obj2 if the set of 35 preferably a balanced tree which contains pointers to all 

records corresponding to obj2 is a subset of the set of records MRO nodes in G and is indexed alphabetically by the 

corresponding to objl. The object manager automatically relationaL_string field 1602 in the dependency information 

adds dependencies from a relational object to an MRO block (DIB) 128 (FIG. 16). It is only necessary to examine 

which contains it MRO's for relation_name. All such MRO's can be identi- 

Tbe object manager maintains a balanced tree known as 40 fled in O(log (n)+m) instructions where n is the total number 

the multiple record tree (MRT) 122 which contains pointers of MRO's and m is the number of MRO's for the relation., 

to all MRO nodes in G and is indexed alphabetically by the name. 

relationaL-String field 1602 in the dependency information For each MRO "obj2 _id" containing obj_id, a depen- 

block (DIB) 128 (FIG. 16). A balanced tree known as the dency from obj_Jd to obj2 ^d is created, 

single relation tree (SRT) contains pointers to all SRO nodes 45 Referring again to FIG. 16, the dependency is preferably 

in G and is also indexed alphabetically by the relational_ initialized with a weight of the mro_dep_weight 1604 for 

string field 1602 in the DEB. An alternative approach which obj2_id. The threshold weight 168 for obj2_id is incrc- 

is easy to implement fiom this description would be to mented by mro_threshold .Jncrement 1605 for obj2_id. A 

maintain a single balanced tree for both single and mxdtiple straightforward extension to the algorithm would be to use 

relations. Another variation would be to use data structures 50 other techniques for assigning weights to the dependency 

other than balanced trees for maintaining this information. and modifying the threshold_weight 168. Retiu:ning now to 

According to the present invention, before a relational FIG. 27, in step 2320, the process returns with a status 

object is created, the relation must be identified to the object message. The order of steps 2305, 2310, and 2315 can be 

manager via the API: define_relation (reladion_name, list_ varied. Furthermore, these steps can be executed concur- 

oL_attributes) 194. 55 rently. 

Each element of the list_of_attributes argument is a pair FIG. 28 depicts an example of logic for creating multiple 

containing the name and type of the attribute. The API record objects (MROs). MRO's are created via the API: 

dcfinc_relation 194 stores information about the relation in creatc.jnro_node (obj_id, initial_v6rsion„num, thresh_ 

the relation info area 129 (FIG, 12). weight, relatioQ_name, lisi_oL_attribute_conditions, reL 

HG. 27 depicts an example of the logic for creating a 60 default_weight, rel_default_thrcshold) 191 (FIG. 18fl); 

node corre^onding to a single record object (SRO). Recall attribute conditions are of the form:o25;>96;>45 and <100; 

that an object corresponding to a single record is looown as etc. An attribute condition can also be null, meaning that 

a single record object (SRO). A balanced tree known as the there is no restriction on the attribute value, 

single relation tree (SRT) contains pointers to all SRO nodes Recall that a single object can represent multiple records 

in G and is also indexed alphabetically by the relational^ 65 from the same relation. Such an object is known as a 

string field 1602 in the DIB (FIG. 16). Anode corresponding multiple record object (MRO) while an object corresponding 

to an SRO is created via the API ceate_sro_node (obj^d, to a single record is known as a single record object (SRO). 
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An MROobjl contains another relational object obj2 if the variant could be applied to Steps 2315, 2615, or 2625. 

set of records corre^onding to obj2 is a subset ofthe set of Another alternative would be to selectively add dependen- 

records corresponding to objl. The object manager auto- cies between MRO's when neither MRO is a subset of the 

matically adds dependencies from a relational object to an other but the two MRO*s have one or more records in 

MRO which contains it. The object manager also preferably 5 common. 

maintains a balanced tree known as the multiple record tree Returning now to FIG. 16, those skilled in the art will 

(MRT) 122 which contains pointers to all MRO nodes in G appreciate that within the spirit and scope of the present 

and is indexed alphabetically by the relationaL_5tring field invention APIs can be added to modify the relationaLjstring 

1602 in the dependency information block (DIB) 128 (FIG. 1602, multiple_records 1603, mro_dep_weight 1604, and 

16). A balanced tree known as the single relation tree (SRI) lO mro_threshold_jncrement 1605 for a relational object after 

contains pointers to all SRO nodes in G and is also indexed the object has been defined via the create_sro_node 190 or 

alphabetically by the relational_5triDg field 1602 in the the create__„„^ode 191 APIs. 

DIB. When one or more records change, the object manager 

As depicted, in step 2600, it is determined if all input can be informed of this via the APIs (FIG. ISa) record_ 

parameters are valid (e.g., they are of the right type, etc). Id i5 has_changed (relation„name, list_of_attribute„values) 

addition, it is verified that the relation "relation_oame" was 195 and records_,have_changed (relation_name, list„of_ 

previously defined via a call to define„relation 194 API attribute_.oonditions) 196. These APIs automatically propa- 

(FIG. ISa) by examining the relation info storage area 129 gate changes throughout the dependence hierarchy. 

(FIG. 12). It is also verified that the list_of_attribiit6_ FIG. 29a depicts an example of how the rccords_have__ 

conditions is valid; and that a node for obj_id or a node 20 changed (relation_name, list_of_attribute_conditions) 

corresponding to the same set of records doesn't already 196 API can be implemented. Those skilled in the art will 

exist. Those skilled in the art will appreciate that it would be appreciate that it is straightforward to implement the 

easy to modify the design so that if a node for obj_id already record__has__changcd (relation_name, list_of_attribut6_ 

existed, the old node would be overwritten. It would also be values) 195 API therefrom. 

easy to modify the design so that multiple nodes with the 25 As depicted, in step 2700 it is determined if the input 

same obj^d could exist It would also be easy to allow parameters are valid. It is also verified that the relation 

multiple nodes to correspond to the same set of records. If relationjame was previously defined (via a call to the 

the result of step 2600 is a determination that all parameters define_relation 194 API (FIG. 18a)) by examining the 

are valid, processing continues with step 2605. Otherwise, relation info area 129 (FIG. 12). It is also verified that the 

create_mro_node returns at step 2620 vnih an appropriate 30 list_of_attribute_conditions is valid. If the input param- 

slatus message. eters are vahd, processing proceeds to step 2710, Otherwise, 

In step 2605, (with reference also to FIG. 16) a new node in step 2730 the procedure is aborted with an appropriate 

is created in G (FIG. 17) by initializing the objecl_id 160 to status message. 

obj_id, vcrsion_num 161 to initial_version_num, In step 2710, all relational objects are found which 

threshold__weight 168 to thresh_weight, and relational_ 35 include at least one record which has changed This can be 

string 1602 to rcladon_nam6 concatenated with all of the done by examining all relational objects on the MRT 122 

attribute conditions. The relation and attribute conditions
comprising the relational_string 1602 are all separated by
delimiters. That way, it is easy to identify the relation and
each attribute condition from the relational_string
1602. The multiple_records 1603 field is set to true; the
mro_dep_weight 1604 is set to rel_default_weight, and
the mro_threshold_increment 1605 is set to
rel_default_threshold.

In step 2610, a pointer to the node is added to the MRT.
The position of the new pointer in the MRT is determined by
relational_string 1602. In step 2615 dependencies are
added from obj_id to each MRO containing it, in the same
manner as step 2315.

For each object obj2_id contained by obj_id, in step
2625 a dependency is added from obj2_id to obj_id. Such
dependent objects are found by searching both the MRT 122
and SRT 123 and considering all other relational objects for
relation_name. Each dependency is assigned a weight of the
mro_dep_weight 1604 for obj_id. For each such
dependency, the threshold_weight 168 for obj_id is
incremented by the mro_threshold_increment 1605 for obj_id.
Those skilled in the art will appreciate that other techniques
can be used for assigning weights to the dependency and
modifying the threshold_weight 168. In step 2620,
create_mro_node returns with a status message. The order of steps
2605, 2610, 2615, and 2625 can be varied. Furthermore,
these steps can be executed concurrently.

Alternatively, the weight of a dependency from a relational
object obj1 to an MRO obj2 which contains it could
be based on the proportion and importance of records
corresponding to obj2 which are also contained in obj1. This

and SRT 123 (FIG. 12) which correspond to the
relation_name. In step 2720, the changes can be propagated to other
nodes in G by invoking the objects_have_changed 189 API
on the list of all objects identified in step 2710.
Finally, in step 2730, records_have_changed returns an
appropriate status message.

A straightforward variant of the records_have_changed
API would be to consider the proportion and importance of
records in a relational object which have changed in
determining how to propagate change information throughout G.

The API compare_objects (obj_id, cache_id1, cache_id2)
192 (FIG. 18b) can be used to determine how similar the
versions of obj_id in cache_id1 and cache_id2 are. For
example, the version_num 141 fields can be compared to
see if the two versions are the same. If they are different, an
indication can be provided of how much more recent one
object is than the other, for example, by the difference in the
version_num 141 and timestamp 142 fields (FIG. 14).

If the two versions of the object are different, a similarity
score can be computed ranging from 0 (least similar) to <1
(1 would correspond to identical versions of the object). The
similarity scores are preferably based on the sum of weights
of incoming dependencies to obj_id from graph objects
obj_id2, for which the version of obj_id2 consistent with
obj_id in cache_id1 is identical to the version of obj_id2
consistent with obj_id in cache_id2. The similarity score
(SS) can be calculated using the formula:

SS=common_weight/sum_weight 167 where
common_weight=sum of weight 165 corresponding to edges from
graph objects obj_id2 to obj_id where the version_num
153 fields corresponding to the edges are identical for both
versions of obj_id. The compare_objects logic can also be
used to determine whether the two versions are highly
dissimilar or not. They are highly dissimilar if and only if
common_weight<threshold_weight.
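
The similarity computation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the preferred embodiment: the Edge class, its field names, and the function names are hypothetical; only the ratio common_weight/sum_weight and the threshold test are taken from the text.

```python
from dataclasses import dataclass


@dataclass
class Edge:
    """Hypothetical incoming dependency edge to obj_id."""
    source_id: str        # graph object obj_id2 at the tail of the edge
    weight: float         # plays the role of weight 165
    version_cache1: int   # version of obj_id2 consistent with obj_id in cache_id1
    version_cache2: int   # version of obj_id2 consistent with obj_id in cache_id2


def common_weight(edges):
    # Sum the weights of edges whose recorded versions match in both caches.
    return sum(e.weight for e in edges if e.version_cache1 == e.version_cache2)


def similarity_score(edges):
    # SS = common_weight / sum_weight
    sum_weight = sum(e.weight for e in edges)
    if sum_weight == 0:
        raise ValueError("object has no weighted incoming dependencies")
    return common_weight(edges) / sum_weight


def highly_dissimilar(edges, threshold_weight):
    # Highly dissimilar if and only if common_weight < threshold_weight.
    return common_weight(edges) < threshold_weight


# Two of three edges (weights 2 and 3 out of a total of 10) agree across caches.
edges = [Edge("a", 2.0, 7, 7), Edge("b", 3.0, 4, 4), Edge("c", 5.0, 1, 2)]
score = similarity_score(edges)
```

In this sketch the score depends only on edge weights, so a single heavily weighted dependency that differs between the two caches lowers the score more than several lightly weighted ones.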
Extensions 

A straightforward extension to the present invention
would be to include threshold_weight fields in OIBs (FIG.
14) and to let caches 2' (FIG. 13) set these fields
independently. Another straightforward extension would be to
allow different consistency lists for the same object
corresponding to different caches.

A further extension would be a system where multiple 
dependencies from a graph object objl to another graph 
object obj2 could exist with different weights. Application 
programs could independently modify these multiple depen- 
dencies. 

Still another extension would be to use other algorithms 
for determining when an object is obsolete based on the 
obsolete links to the object.

When a graph object changes, the preferred embodiment 
does not consider how the graph object changes when 
propagating the information through the dependence graph 
G. It only takes into account the fact that the graph object has 
changed. An extension would be to also consider how a 
graph object changes in order to propagate the changes to 
other graph objects. This could be done in the following 
ways: 

1. By providing additional information about how a graph 
object has changed via parameters to functions such as
object_has_changed. This information would be used to
modify links from the graph object to other graph objects 
which depend on its value and would be subsequently used 
to determine how successors to the graph object have 
changed. 

2. When the object manager 120 determines that a graph 
object o2 has changed, the object manager could consider 
both: which predecessors of it have changed; and any 
information that it has recursively collected on how the 
predecessors have changed. The object manager would then 
use this information to determine how o2 has changed. The
information on how o2 has changed would be used to 
modify links to other graph objects which depend on o2 and 
would be subsequently used to determine how successors to 
o2 have changed. 

For example, consider FIG. 29b. u2 and u3 are underlying 
data which have changed. The object manager propagates 
the change information to o1 and o3. When the object
manager propagates change information to o2, it not only
considers the weights of the edges from o1 and o3 to o2 in
determining how to update or invalidate cached copies of o2.
It also considers the nature of the changes to u2, u3, o1, and
o3. This information may also be used to determine how to 
update or invalidate cached versions of o4. 

Other Applications 

The present invention can also be used in a system where 
an application has to make a decision on whether or not to 
update underlying data. By examining the object depen- 
dence graph, the system can determine the other objects 
affected by the changes to the underlying data. If this set is 
satisfactory, the changes could be made. Otherwise, the 
system could refrain from making the changes to the under- 
lying data. 
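
The decision described above can be sketched as a reachability computation over the object dependence graph. This is only an illustrative sketch: the adjacency-dictionary representation, the function names, and the example edges are assumptions, and what counts as a "satisfactory" set is left to a caller-supplied predicate.

```python
from collections import deque


def affected_objects(successors, changed):
    """Return every graph object reachable from the changed underlying data."""
    seen = set()
    queue = deque(changed)
    while queue:
        node = queue.popleft()
        for nxt in successors.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def should_apply_update(successors, changed, is_satisfactory):
    # Make the change only if the caller accepts the set of affected objects.
    return is_satisfactory(affected_objects(successors, changed))


# Hypothetical graph: u2 -> o1 -> o2 -> o4 and u3 -> o3 -> o2.
graph = {"u2": ["o1"], "u3": ["o3"], "o1": ["o2"], "o3": ["o2"], "o2": ["o4"]}
impact = affected_objects(graph, ["u2"])
allowed = should_apply_update(graph, ["u3"], lambda objs: "o4" not in objs)
```

Here a change to u3 is refused because the hypothetical policy does not tolerate o4 being affected.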

Those skilled in the art will appreciate that the present
invention could also be used by a compiler, run-time system, 
or database in order to efficiently schedule operations. 
Different schedules could result in different changes to 
underlying data. By analyzing the object dependence graph,
the program making scheduling decisions could determine a
favorable method to schedule operations. 
Detailed Description of a Scaleable Method for Maintaining 
and Consistently Updating Caches 
This embodiment of the present invention is designed to 
function on a collection of one or more physical (computer) 
systems connected by a network. There may be more than 
one instance of the present invention residing in this collection
of systems. Although dictionary meanings are also implied, the
following definitions are provided for guidance to distinguish
among multiple instances of the present invention.

Object Sources. Object Sources include one or more 
products such as are sold by IBM under the trademark DB2
and by Lotus under the trademarks LOTUS NOTES and 
DOMINO Server, or Other Sources 3030 including data or 
objects from which more complex objects (such as HTML 
pages) are built. 

Trigger. Any means which can be used to cause actions to 
occur automatically in response to modification in the data. 
A trigger is a standard feature of many standard Object 
Sources such as are sold by IBM under the trademark DB2 
and by Lotus under the trademarks LOTUS NOTES and
DOMINO Server to cause actions to occur automatically in 
response to modification in the data. One embodiment of the 
present invention uses triggers in a novel way to keep 
objects built from data stored in an Object Source synchro- 
nized with the data. 

Trigger Notification. This is a message sent to the present 
invention in response to a trigger being invoked within an 
Object Source.

Cache transactions. Include requests to a cache manager 
to read, update, or delete cache objects. 

Trigger Monitor. An example of logic in accordance with 
the present invention for keeping the objects in a cache 
managed by a Cache manager synchronized with associated
remote data. The Trigger Monitor can be a single long
running process monitoring remote data sources for the
purpose of keeping complex objects stored in a cache
managed by a Cache manager synchronized with the underlying
data.

Master Trigger Monitor. This is an instance of a Trigger
Monitor which receives Trigger Notifications. 

Slave Trigger Monitor. This is an instance of a Trigger 
Monitor to which Trigger Notifications are forwarded from 
a Master trigger monitor 3000' (that is, not from Object
Sources directly). 

Local Cache. This is a cache (or other standard object 
store such as a file system) which is updated by an instance 
of a Trigger Monitor residing on the same physical machine 
as the cache itself. 

Remote Cache. This is a cache (or other standard object 
store such as a file system) which is updated by an instance 
of a Trigger Monitor residing on a different physical 
machine from the cache itself. 

It is possible for the present invention to play the role of 
both Master 3000 (if it receives trigger events) and Slave 
3000a (if it receives notifications of trigger events from 
some master). 

Referring now to the drawings, FIG. 30a depicts a block 
diagram example of a system having features of the present 
invention. As depicted, the system includes (one or more) 
remote nodes 3108. The nodes 3108 can be servers providing
Web pages to clients via Web servers (denoted as httpd
3080). Each Web server can provide a significant percentage
of dynamic Web pages which are constructed from a database
3010. Each such server node 1001, because of the cost
involved in generating Web pages, caches one or more
objects 3004 including complex objects such as dynamic
Web pages. Multiple requests for the same dynamic page
can be satisfied from the cache 3003, thus reducing overhead.

The use of multiple server nodes 3108 increases the
volume of requests that the system can service. It is possible,
although not a requirement, that the server nodes 3108 can
be separated geographically by long distances.

In accordance with the present invention, when a change
to an object source such as the database 3010 occurs which
might affect the value of one or more objects 3004 stored in
a cache 3003, a trigger monitor 3000 notifies each cache
manager 3001 of the objects whose values have changed.
The trigger monitor 3000 might inform a cache manager
3001 that an object 3004 in its cache 3003 has changed. In
this case, the cache manager 3001 could invalidate its copy
of the object 3004. Alternatively, the trigger monitor 3000
could inform a cache manager 3001 that an object 3004 has
changed and also provide the new value of the object 3004.
Those skilled in the art will appreciate that the new value for
the object 3004 could be computed on the data server node
3102 as well as the remote node 3108 or some intermediate,
e.g., proxy node. In either alternative case, the cache manager
would also have the option of dynamically updating the
object 3004, e.g., storing the new version, without having to
invalidate it.
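
The two notification styles described above (invalidate only, or invalidate with a replacement value) can be sketched as follows. The class and method names are hypothetical stand-ins chosen for the example and are not the interface of the cache manager 3001; the sketch also assumes that a value of None is never cached.

```python
class CacheManager:
    """Toy stand-in for a cache manager; not the patent's interface."""

    def __init__(self):
        self.cache = {}

    def object_changed(self, object_id, new_value=None):
        if new_value is None:
            # Only told that the object changed: drop the stale copy.
            self.cache.pop(object_id, None)
        else:
            # Told the new value as well: replace the copy in place.
            self.cache[object_id] = new_value


def notify_all(cache_managers, object_id, new_value=None):
    # The trigger monitor notifies each cache manager of the change.
    for manager in cache_managers:
        manager.object_changed(object_id, new_value)


a, b = CacheManager(), CacheManager()
a.cache["page1"] = "old"
b.cache["page1"] = "old"
notify_all([a, b], "page1")            # invalidation only
notify_all([a, b], "page2", "fresh")   # update with the new value
```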

FIG. 30b depicts a more detailed example of the Trigger
Monitor 3000. Here, the Trigger Monitor 3000 is instantiated
as a Master Trigger Monitor 3000'. As depicted, the
maintenance of caches 3003 including complex objects 3004
is done by a process (or collection of processes) according
to the present invention called the Trigger Monitor 3000.
The Trigger Monitor 3000 is preferably a single long running
process monitoring data sources 3050 for the purpose
of keeping the contents of a Cache manager 3001 synchronized
with the underlying data. A Master trigger monitor
3000' is an instance of a Trigger Monitor 3000 which
receives Trigger Events 3020. The Master Trigger Monitor
3000' includes: a Trigger Monitor Driver 3040; Object Id
Analysis 3041 logic; Object Generator 3042 logic; and a
Distribution Manager 3043.

The Master Trigger Monitor 3000' works in conjunction 
with Object Sources 3050, cache manager 3001 (known as 
a local cache manager), and zero or more other (Slave) 
Trigger Monitors 3000" (FIG. 30c) and a remote cache
manager 3002, which reside on other physical machines. 
Object Sources 3050 include one or more entities; for 
example a database 3010 such as is sold by IBM Corp. under 
the trademark DB2; or any Other Sources 3030 such as a 
server sold by Lotus Corp. under the trademark DOMINO,
from which more complex objects (such as HTML pages) 
are built. 

When an Object Source 3050 detects a change, a trigger 
is invoked. The trigger, which is a standard feature of many 
standard Object Sources 3050 such as the above, is typically
used to cause actions to occur automatically in response to
modification of the data. The present invention uses triggers
in a novel way to keep objects 3004 built from data stored in
an Object Source synchronized with the data. Associated
with the trigger is a send_trigger 3026 API (see FIG. 30d)
which causes a message to be sent to the Trigger Monitor
Driver 3040. In response, the Trigger Monitor Driver 3040
can then generate a transaction (see FIG. 30e) called a
Trigger Event 3020. 

The Trigger Event 3020 can be translated (by conventional
means) into a Record ID 3012 and forwarded to a
Cache Manager 3001 for translation. The Cache Manager
3001 returns a corresponding list of Object IDs 3009 which
are enqueued to the Object Id Analysis (OIA) component
3041. The OIA 3041 generates, by well known means, a set 
of Object Disposition Blocks (ODB) 3100 (described 
below), one for each Object ID 3009. 

FIG. 31 depicts an example of the Object Disposition 
Block (ODB) 3100. The Object ID 3009 is used to identify 
an object 3004 in the cache 3003 when subsequently replacing
or deleting the object. The Cache Id 3200 is used to
identify which of the caches 3003 the object 3004 belongs
in. The External ID 3101 is an additional identifier by which
the Object Generator 3042 might know the object. The
Request Disposition 3103 is used by the Object Generator to
generate an Update Object Request 3022 or a Delete Remote
Object Request 3025 (FIG. 30e). If the request disposition
3103 is a DispRegenerate 3130, the objects 3004 represented
by the ODB 3100 are regenerated by the system and
distributed. If the request disposition 3103 is a DispInvalidate
3131, the objects 3004 are deleted from all systems.

FIG. 32 depicts an example of the cache ID 3200. As 
depicted, the Cache ID preferably includes a cache name
3201, a cache host 3202 identifier and cache port 3203
identifier. 
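
The Object Disposition Block and Cache ID described above can be pictured as plain records. The following is a sketch under stated assumptions: the Python class and field names are illustrative, the field types are guesses, reusing the reference numerals 3130 and 3131 as enum values is purely a convenience, and build_odbs (which simply reuses the object ID as the external ID) is a hypothetical helper rather than the OIA logic itself.

```python
from dataclasses import dataclass
from enum import Enum


class RequestDisposition(Enum):
    DISP_REGENERATE = 3130   # regenerate the object and distribute it
    DISP_INVALIDATE = 3131   # delete the object from all systems


@dataclass(frozen=True)
class CacheId:
    cache_name: str
    cache_host: str
    cache_port: int


@dataclass(frozen=True)
class ObjectDispositionBlock:
    object_id: str                          # identifies the object in the cache
    cache_id: CacheId                       # which cache the object belongs in
    external_id: str                        # name the Object Generator may know it by
    request_disposition: RequestDisposition


def build_odbs(object_ids, cache_id, disposition):
    # One ODB per Object ID, as the OIA component is described as producing.
    return [ObjectDispositionBlock(oid, cache_id, oid, disposition)
            for oid in object_ids]


odbs = build_odbs(["p1", "p2"], CacheId("main", "host-a", 8080),
                  RequestDisposition.DISP_INVALIDATE)
```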

Returning now to FIG. 30b, the ODB 3100 is sent to the 
Object Generator 3042. The Object Generator examines the 
ODB 3100 and does one of the following: a) generates a 
Delete Remote Object Request 3025; or b) establishes connections
with the Object Sources 3050, rebuilds the object 3004,
and creates an Update Object Request 3022. 

The TMD 3040 then passes the Delete Remote Object 
Request 3025 or the Update Object Request 3022 to the 
Distribution Manager 3043. 

The Distribution Manager 3043 establishes a connection 
with each configured Remote Cache Manager 3002 or Slave 
Trigger Monitor 3000" (FIG. 30c), and delivers each the 
request. If the request is a Forward Trigger Request 3021, 
the request is sent to the Slave Trigger Monitor 3000" (FIG.
30a). If the request is an Update Object Request 3022, the
new object is sent to the Remote Cache manager 3001 via
the cache_object 410 API (FIG. 4). If the request is a Delete
Remote Object Request 3025 the object 3004 is purged from
each Remote Cache manager 3001 via the delete_object
420 API (FIG. 4).

FIG. 30c depicts another example of the Trigger Monitor 
3000. Here, the Trigger Monitor 3000 is instantiated as a 
Slave Trigger Monitor 3000". If the Master Trigger Monitor 
3000' is maintaining exactly one system, or if an object 3004 
is to be regenerated (that is, not deleted), it can be fully
maintained using the process described in FIG. 30b. If the
Trigger Monitor 3000 is maintaining multiple systems, it is 
possible that the object 3004 exists in some but not all 
caches. In particular, the object 3004 may not exist in the 
same cache as the Trigger Monitor 3000 which received the 
Trigger Event 3020. To handle this case, a Slave Trigger
Monitor 3000" (FIG. 30c) is run on each configured node. 
As depicted, the Slave Trigger Monitor 3000" receives a 
Forward Trigger Request 3021. This is processed identically 
to a Trigger Event 3020 until it arrives in the Object 
Generator 3042. If the Object Disposition Block 3100 has a 
Request Disposition 3103 equal to DispRegenerate 3130, 
the request is discarded. If the Request Disposition 3103 is
DispInvalidate 3131, a Delete Local Object Request 3023 is
built and sent to the Slave's Local Cache. 

Referring again to FIG. 30a, the trigger monitor 3000 is
preferably embodied as a single long running process, 
monitoring the object sources 3050. One skilled in the art 
could easily adapt the present invention to consist of one or
more processes per component, some of which may overlap
in time to improve throughput of the system. One skilled in 
the art could also easily adapt the present invention to use 
multiple threads of operation in a single process, each thread 
implementing one or more of the components, some of
which may overlap in time, if the underlying system provides
support for threaded processes.

Conventional mechanisms such as multiphase commit
and persistent data objects are preferably used when receiving
Trigger Events 3020 and Forward Trigger Requests 3021
to provide a guarantee to the object sources 3050 that these
requests, once delivered, remain in the system until completion.
Conventional mechanisms such as retry and multiphase
commit are preferably used to provide a guarantee
that enqueued outbound requests (depicted in FIG. 30e)
remain in the system until completion.

The Object Id Analysis (OIA) component 3041 translates 
the Object IDs 3009 into Object Disposition Blocks 3100 
(FIG. 31). The OIA 3041 may be specified and interfaced as 
a configuration option, an API, or in any other standard way.
One skilled in the art could easily build such a mechanism. 

The Object Generator 3042 translates the information in
an Object Disposition Block (3100) into the transaction
types depicted in FIG. 30e and described below. The trigger
monitor 3000 provides an interface to this component using
configuration options, APIs, or any other standard technique.
Examples of Object Generators 3042 are the products sold
by: IBM under the trademark NET.DATA; Lotus Corporation
under the trademark DOMINO Server, or any Web
server from which HTML pages can be fetched.

FIG. 30d depicts an example of the send_trigger API. As
depicted, the send_trigger 3026 API enables the Object
Sources 3050 to communicate with the Trigger Monitor
Driver 3040. The send_trigger 3026 API sends a message
including sufficient information (message parameters) to
uniquely identify the trigger and construct a Trigger Event 
3020. One skilled in the art could easily define and specify 
that information using standard techniques (such as variable- 
length parameter lists). 

FIG. 30e depicts examples of transaction types used in
accordance with the present invention. As depicted, several 
transactions 3020 . . . 3025 can be generated within the 
system: 

A Trigger Event 3020 is generated in response to receipt 
of a message sent via the send_trigger 3026 API. The
Trigger Event 3020 is a structure which maintains sufficient 
information to translate the data sent by the send_trigger 
3026 API into one or more Show Dependent Object 
Requests 3024 and to properly track and guide itself through 
the system.

A Forward Trigger Request 3021 is generated in response 
to receipt of a Trigger Event 3020 sent via the send_trigger 
3026 API. The Forward Trigger Request 3021 is a structure 
which maintains sufficient information to generate one or 
more Show Dependent Object Requests 3024 and to properly
track and guide itself through the system.

An Update Object Request 3022 is generated by the 
Object Generator 3042 to cause new objects to be distributed 
to Remote Cache Managers 3002 via the Distribution Manager
3043. The Update Object Request is a structure which
maintains sufficient information to replace an object 3004 in 
any arbitrary cache 3003. 

A Delete Local Object Request 3023 is generated by the 
Object Generator to cause a local Cache 3003 to delete an 
object 3004. The Delete Local Object Request 3023 is a
structure which maintains sufficient information to delete an
object 3004 from the Local Cache manager 3001.

A Show Dependent Object Request 3024 is generated by 
the Trigger Monitor Driver 3040 in response to a Trigger 
Event 3020 to request the dependency information from the 
Local Cache Manager 3001. The Show Dependent Object 
Request 3024 is a structure which maintains sufficient 
information to analyze a Trigger Event 3020 or a Forward 
Trigger Request 3021 and invoke the API
show_dependent_objects 3024 to acquire Object IDs 3009 from
the Local Cache Manager 3001. 

A Delete Remote Object Request 3025 is generated by the 
Object Generator 3042 to cause an object 3004 to be deleted 
from remote cache managers 3002 via the Distribution 
Manager 3043. The Delete Remote Object Request 3025 is 
a structure which maintains sufficient information to delete 
an object 3004 from an arbitrary cache 3003. 

FIG. 33 depicts an example of a high-level organization
and communication paths of the Trigger Monitor Driver
3040 and the Distribution Manager 3043. The preferred
organization consists of several independently executing
threads of control: 

A Receiving Thread 3300 receives requests including 
Trigger Event 3020 and Forward Trigger Request 3021 and 
saves them to some persistent store. An Incoming Work 
Dispatcher Thread 3320 dequeues incoming requests from 
3300 and enqueues them for processing. A Cache Manager 
Communications Thread 3340 sends the Delete Local 
Object Request 3023 and Show Dependent Object Request 
3024 requests to the Local Cache Manager 3060. An Object 
Generator Thread 3360 coordinates generation of the object
requests: Delete Remote Object Request 3025; and Update 
Object Request 3022, and enqueues them for distribution. A 
Distribution Thread 3380 (which is a main component of the
Distribution Manager 3043) dequeues requests from the 
Distribution Manager Queue 3370 and enqueues them to all 
outbound machines. The Outbound Transaction threads 
3395 contact remote machines and forward the work 
enqueued on the Machine Outbound Queues 3390. 

As is conventional, these threads can communicate via 
several FIFO queues: the Incoming Request Queue 3310; 
the Cache Manager Request Queue 3330; the Object Gen- 
erator Queue 3350; the Distribution Manager Queue 3370; 
and the Machine Outbound Queues 3390 (one per distributed
cache).

FIG. 34 depicts an example of the Receiving Thread 3300
logic. As depicted, in step 3410, an incoming message
(either the send_trigger API 3026 or a Forward Trigger
Request 3021) enters the system and is converted to a
Trigger Event 3020. In step 3420, the message is written by
the receiving thread 3300 to a persistent queue 3450 and
enqueued in step 3430 to the Incoming Request Queue 3310.
In step 3440, the request type is checked. In step 3460, if it
is a Trigger Event 3020, a Forward Trigger Request 3021 is
enqueued to the Distribution Manager Queue 3370. In step
3490, the receiving thread 3300 returns to waiting for
work.
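
The receiving logic above can be sketched with ordinary in-process queues. This is a simplified illustration rather than the described implementation: the dictionary request format and the function name are assumptions, a Python list stands in for the persistent queue 3450 (a real system would use durable storage), and the sketch keeps the request type so that only requests originating from send_trigger are forwarded.

```python
import queue

TRIGGER_EVENT = "TriggerEvent"
FORWARD_TRIGGER_REQUEST = "ForwardTriggerRequest"


def receive(request_type, payload, persistent_log, incoming_q, distribution_q):
    request = {"type": request_type, "payload": payload}   # step 3410
    persistent_log.append(request)                          # step 3420
    incoming_q.put(request)                                 # step 3430
    if request_type == TRIGGER_EVENT:                       # steps 3440 and 3460
        # Only trigger events are forwarded on to the other machines.
        distribution_q.put({"type": FORWARD_TRIGGER_REQUEST, "payload": payload})
    return request


log, incoming, distribution = [], queue.Queue(), queue.Queue()
receive(TRIGGER_EVENT, "row 42 changed", log, incoming, distribution)
receive(FORWARD_TRIGGER_REQUEST, "row 7 changed", log, incoming, distribution)
```

Not re-forwarding a Forward Trigger Request is what keeps a slave from echoing work back to other machines in this sketch.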

FIG. 35 depicts an example of the Incoming Work Dispatcher
Thread 3320 logic. As depicted, in step 3510, the
incoming work dispatcher thread 3320 dequeues the work
request. In step 3520, a Show Dependent Object Request
3024 is enqueued to the Cache Manager Request Queue
3330. In step 3590, the incoming work dispatcher thread 3320
returns to waiting for work.

FIG. 36 depicts an example of the Cache Manager Com- 
munications Thread 3340 logic. As depicted, in step 3610, 
the cache manager communications thread 3340 dequeues a 
next request and establishes communications with the Local 
Cache Manager 3001. In step 3023, if the request is a Delete
Local Object Request, in step 3650, the delete_object 420
API is used to delete the object from the local cache 3003.
In step 3024, if the request is a Show Dependent Object
Request, in step 3620 the show_dependent_objects 460
API is used to fetch the Object IDs 3009. In step 3630, the
Object IDs 3009 are passed to the Object ID Analysis 3041
component which builds an Object Disposition Block 3100.
In step 3640, the Object Disposition Block 3100 is enqueued
to the Object Generator 3042. Finally, in step 3690, the
Cache Manager Communications Thread 3340 returns to
waiting for work.

FIG. 37 depicts an example of the Object Generator 
Thread 3360 logic. As depicted, in step 3710, the object 
generator thread 3360 dequeues a next request from the 
queue 3350. In step 3720, the Disposition of the object is
checked. If it is a DispInvalidate 3131 proceed to step 3750;
if a DispRegenerate 3130 proceed to step 3730. In step 3730,
the RequestType is checked. If it is a Forward Trigger
Request 3021 proceed to step 3770; if it is a Trigger Event
3020 proceed to step 3740. In step 3740, the Data Sources
3050 are contacted to regenerate the objects 3004. The new 
objects 3004 are enqueued with an Update Object Request 
3022 to the Distribution Manager Queue 3370. The process 
then returns to step 3790 to wait for work. 

In step 3750, the RequestType is checked. If it is a
Forward Trigger Request 3021 proceed to step 3780; if it is
a Trigger Event 3020 proceed to step 3760. In step 3760, a
Delete Remote Object Request 3025 is built and enqueued
to the Distribution Manager Queue 3370. The process then
returns to step 3790 to wait for work.

In step 3770, the request is deleted from the system. The 
process then returns to step 3790 to wait for work.

In step 3780, a Delete Local Object Request 3023 is 
enqueued to the Cache Manager Request Queue 3330. The 
process then returns to step 3790 to wait for work.
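
The branching in steps 3720 through 3780 amounts to a four-way decision on the disposition and the request type. It can be sketched as a lookup table; the string constants and the function name are illustrative assumptions, and the returned descriptions merely name the action the text assigns to each step.

```python
# (disposition, request type) -> (step, action), following the text above.
OBJECT_GENERATOR_ACTIONS = {
    ("DispRegenerate", "TriggerEvent"):
        (3740, "regenerate objects; enqueue Update Object Request"),
    ("DispRegenerate", "ForwardTriggerRequest"):
        (3770, "delete the request from the system"),
    ("DispInvalidate", "TriggerEvent"):
        (3760, "enqueue Delete Remote Object Request"),
    ("DispInvalidate", "ForwardTriggerRequest"):
        (3780, "enqueue Delete Local Object Request"),
}


def object_generator_action(disposition, request_type):
    """Return the (step, action) pair for one dequeued request."""
    try:
        return OBJECT_GENERATOR_ACTIONS[(disposition, request_type)]
    except KeyError:
        raise ValueError(
            f"unsupported combination: {disposition}, {request_type}") from None


master_step, _ = object_generator_action("DispInvalidate", "TriggerEvent")
slave_step, _ = object_generator_action("DispInvalidate", "ForwardTriggerRequest")
```

The table makes the master/slave asymmetry visible: a trigger event always produces outbound work, while a forwarded request either deletes locally or is dropped.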

FIG. 38 depicts an example of the Distribution Manager 
Thread 3380 logic. As depicted, in step 3810 the Distribution
Manager Thread 3380 dequeues work from the Distribution
Manager Queue 3370 and enqueues a copy of the
request to each of the Machine Outbound Queues 3390. The
process then returns to step 3790 to wait for work. 
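
The fan-out performed in step 3810 can be sketched as follows. The dictionary of per-machine queues, the machine names, and the function name are assumptions made for the example; a deep copy is used here so that each outbound queue owns an independent copy of the request.

```python
import copy
import queue


def distribute(distribution_q, machine_outbound_qs):
    """Dequeue one request and enqueue a copy to every Machine Outbound Queue."""
    request = distribution_q.get_nowait()
    for outbound in machine_outbound_qs.values():
        outbound.put(copy.deepcopy(request))
    return request


distribution_q = queue.Queue()
outbound_qs = {"node-a": queue.Queue(), "node-b": queue.Queue()}
distribution_q.put({"type": "DeleteRemoteObjectRequest", "object_id": "p1"})
original = distribute(distribution_q, outbound_qs)
```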

FIG. 39 depicts an example of the Outbound Transaction 
Thread 3395 logic. There is one Outbound Transaction 
Thread 3395 for each machine participating in the distributed
update scheme. As depicted, in step 3910 the thread
dequeues work from the Machine Outbound Queue 3390
and checks the request type. In step 3920, if it is an Update
Object Request 3022 or Delete Remote Object Request 3025
the process continues at step 3920; if it is a Forward Trigger
Request 3021, the process continues at step 3930. In step
3930, if it is a Forward Trigger Request 3021 the process
continues at step 3930. 

In step 3920 the remote Cache manager 3001 is contacted.
In step 3940, if the request is an Update Object Request
3022, the cache_object API 410 is used to send the new
objects 3004 to the remote cache manager 3002. The process
then returns to step 3990 to wait for work. In step 3950, if
the request is a Delete Remote Object Request 3025, the
delete_object API 420 is used to delete the objects 3004
from the remote cache manager 3002. The process then
returns to step 3990 to wait for work. 

In step 3930, the remote Trigger Monitor 3000a is contacted.
In step 3960, the Forward Trigger Request 3021 is
sent to the remote Trigger Monitor 3000. The process then
returns to step 3990 to wait for work.
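
The per-request dispatch performed by an Outbound Transaction Thread can be sketched as below. The two stub classes are hypothetical in-memory stand-ins for a remote cache manager and a remote trigger monitor; their method names echo the cache_object and delete_object APIs named in the text, but the signatures and the dictionary request format are assumptions.

```python
class StubRemoteCacheManager:
    """Hypothetical in-memory stand-in for a remote cache manager."""

    def __init__(self):
        self.objects = {}

    def cache_object(self, object_id, value):
        self.objects[object_id] = value

    def delete_object(self, object_id):
        self.objects.pop(object_id, None)


class StubRemoteTriggerMonitor:
    """Hypothetical stand-in for a remote (slave) trigger monitor."""

    def __init__(self):
        self.forwarded = []

    def forward(self, request):
        self.forwarded.append(request)


def send_outbound(request, cache_manager, trigger_monitor):
    kind = request["type"]
    if kind == "UpdateObjectRequest":            # step 3940
        cache_manager.cache_object(request["object_id"], request["value"])
    elif kind == "DeleteRemoteObjectRequest":    # step 3950
        cache_manager.delete_object(request["object_id"])
    elif kind == "ForwardTriggerRequest":        # steps 3930 and 3960
        trigger_monitor.forward(request)
    else:
        raise ValueError(f"unknown request type: {kind}")


cm, tm = StubRemoteCacheManager(), StubRemoteTriggerMonitor()
send_outbound({"type": "UpdateObjectRequest", "object_id": "p1", "value": "v2"}, cm, tm)
send_outbound({"type": "ForwardTriggerRequest", "payload": "row 42 changed"}, cm, tm)
```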
Extensions and Variations
Other exits not iterated here may be required for full
analysis of Trigger Events 3020 and translation into actions 
(such as Update Object Request 3022 or Delete Remote 
Object Request 3025), depending on the specific application 
of this invention. 

For example, referring now to FIG. 40: 

a) it may be useful to translate 4000 a single Trigger Event 
3020 into a set of multiple Show Dependent Object Requests 
3024 via an exit; 

b) it may be useful to modify or analyze 4010 an object
3004 as created by the Object Generator 3042, prior to
enqueuing that object 3004 in an Update Object Request
3022; and

c) it may be useful to write an object 3004 to the file
system instead of, or in addition to, writing the object 3004
to cache 3003.

Another use of the Trigger Monitor 3000 would be to 
reuse its ability to generate and distribute objects for the 
purpose of handling objects which may not currently exist in 
cache: 

a) a prime_cache API 4020 could be used to generate and
distribute an object 3004 given an object ID 3009, regardless
of whether that object 3004 is currently known to any
cache 3003; and

b) a global_delete API 4030 could be used to ensure that
some specific object 3004 is removed from all caches 1 in
the system without knowing whether that object actually 
exists anywhere. 

The Trigger Monitor 3000 may be implemented to 
enforce strict FIFO ordering and processing of requests, or 
to permit full asynchronous processing of requests, or to 
process requests according to any well known scheduling 
scheme, or any combination of the above. 
Maintaining Consistency 

As discussed herein before, while dictionary meanings are
also implied by terms used herein, the following glossary of 
some terms is provided for guidance: 

A transaction manager is a program which manages state. 
Examples include: cache managers managing caches; data- 
base management systems such as DB2; and transaction 
processing systems such as CICS.

A transaction is a request made by another program to a 
transaction manager. 

A state-changing transaction is a transaction which modi- 
fies state managed by the transaction monitor. Requests to a 
cache manager to read, update, or delete cache objects 
would constitute transactions. 

Reads and modifications of data are known as accesses. 

A lock is an entity which limits the ability of processes to 
read or write shared data. When a process acquires a read 
lock on a piece of data, other processes can access the data 
but no other processes may modify the data. When a process 
acquires a write or exclusive lock on the data, no other 
processes may read or modify the data. Several methods for 
implementing locks exist in the prior art. See e.g., "Com- 
puter Architecture: A Quantitative Approach," 2nd edition, 
by Hennessy and Patterson, Morgan Kaufmann, 1996.
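
The read-lock and write-lock semantics defined above can be illustrated with a small non-blocking lock. This is one of many possible implementations and is not taken from the cited text: the class name and the try/release method names are assumptions, and a production lock would normally block and address fairness rather than simply report failure.

```python
import threading


class ReadWriteLock:
    """Minimal non-blocking readers-writer lock, for illustration only."""

    def __init__(self):
        self._guard = threading.Lock()   # protects the two counters below
        self._readers = 0
        self._writer = False

    def try_acquire_read(self):
        # Readers may share the data unless a writer holds it.
        with self._guard:
            if self._writer:
                return False
            self._readers += 1
            return True

    def release_read(self):
        with self._guard:
            self._readers -= 1

    def try_acquire_write(self):
        # A writer needs exclusive access: no readers and no other writer.
        with self._guard:
            if self._writer or self._readers:
                return False
            self._writer = True
            return True

    def release_write(self):
        with self._guard:
            self._writer = False


lock = ReadWriteLock()
first_reader = lock.try_acquire_read()
second_reader = lock.try_acquire_read()
writer_while_reading = lock.try_acquire_write()
```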

Let S be a set of transactions which modify data d on a
system containing one or more transaction managers. S is
performed consistently if:

(1) for any request r1 not in S which accesses all or part
of d, all parts of d accessed by r1 are either in a state before
modification by any transaction in S or in a state after
modification by all transactions in S.

(2) For any requests r1 and r2 not in S where r2 is received
by the system either at the same time as r1 or after r1 and
both rl and r2 access a subset d' of d, 
(a) if the version of d' accessed by r1 has been modified
by transactions in S, then the version of d' accessed by
r2 has also been modified by transactions in S.

(b) if the version of d' accessed by r2 has not been
modified by transactions in S, then the version of d'
accessed by r1 has also not been modified by
transactions in S.

A timestamp is an attribute which can be assigned to
events such as a transaction being received by a system or
a lock being acquired. Common methods for implementing
timestamps in the prior art include clock times and numbers
which order events.

Another feature of the present invention is the ability to
make a set of consistent updates to one or more caches. The
present invention is of use for a set of requests S to one or
more cache managers 3001 where the following properties
are desirable:

(1) For any program p accessing the system, S must be
made atomically. That is, p cannot have a view of the system
where some requests in S have been satisfied and others have
not.

(2) For any two requests r1 and r2 received by appropriate
cache managers 3001 at the same time, r1 and r2 see the
same view of the system with respect to S. That is, either
both r1 and r2 see a view of the system before requests in S
have been satisfied, or both r1 and r2 see a view of the
system after requests in S have been satisfied.

(3) For any two requests r1 and r2 where r2 is received by
a cache manager 3001 after r1 is received by a cache
manager, if r1 has a view of the system after requests in S
have been satisfied, then r2 must see the same view of the
system. If r2 sees a view of the system before requests in S
have been satisfied, then r1 must see the same view.

FIG. 41 depicts an example of logic for making a set S of
requests consistently to a system including one or more
caches. Preferably, each request in S is directed to one cache
manager 3001. The set of cache managers C receiving a
request from S may have one or more members.

As depicted, in step 4500, the set of requests S is received
by the system. Each request is directed to a specific cache
manager 3001.

In step 4505, the cache managers lock data. For each
cache manager i receiving a request from S, the cache
manager i acquires write locks for data modified by a request
in S and read locks for data read by a request in S but
not written by a request in S. Data locked in this step will
subsequently be referred to as locked data.

In step 4600, the system determines the time the last lock
was acquired, last_lock_time. If the set of cache managers
C receiving a request from S has only one member, this step
can easily be implemented using prior art. If C has multiple
members, last_lock_time is determined in the manner
described in FIG. 42.

In step 4510, requests received before last_lock_time
which are waiting on locked data are performed. In step
4520, requests in S are performed. In step 4530, locks are
removed from locked data which allows requests received
after last_lock_time which are waiting on locked data to be
performed. Steps 4510, 4520, and 4530 must be performed in
order.

An alternative embodiment to that depicted in FIG. 41 is
to use a single lock to prevent requests from accessing data
accessed by a request in S. The preferred embodiment
allows much higher levels of concurrency than this
alternative approach.

FIG. 42 depicts an example of logic for determining a
last_lock_time if the set of cache managers C receiving a
request from S has multiple members. As depicted, in step
4600, each member of C denoted cache mgr i determines the
time at which it acquired the last lock in step 4505,
last_lock_time_i; cache mgr i then sends last_lock_time_i to
a program known as a coordinator program. In step 4610, the
coordinator program receives last_lock_time_i values
from all cache managers in C and sets last_lock_time to the
latest last_lock_time_i value it receives. In step 4615, the
coordinator program sends last_lock_time to all cache
managers in C.

A variation on the example depicted in FIG. 42 would be
for each cache mgr i in C to exchange values of
last_lock_time_i with other cache managers in C in step 4600
instead of sending last_lock_time_i to a coordinator program.
In step 4610, each cache mgr i in C would determine
last_lock_time from the last_lock_time_i values it receives.
Step 4615 would not be necessary. The preferred embodiment
requires less communication and fewer comparisons
when C is large and is thus more scaleable than the variation
just described.

One skilled in the art could easily adapt the present
invention to achieve consistency in other systems containing
one or more transaction managers wherein the transaction
managers do not have to be cache managers.

Now that the invention has been described by way of a
detailed description, with alternatives, various
enhancements, variations, and equivalents will become
apparent to those of skill in the art. Thus it is understood that
the detailed description has been provided by way of
example and not as a limitation. The proper scope of the
invention is properly defined by the claims.

What is claimed is:

1. In a system comprising a set of one or more transaction
managers, a method for consistently performing a set S of
one or more state-changing transactions which modify state
managed by a set T of one or more transaction managers
comprising the steps of:

(a) acquiring a plurality of locks on data known as locked
data which prevent transactions not in S from one of (i)
modifying data accessed by a transaction in S and (ii)
reading data modified by a transaction in S;

(b) storing a blocked request set B comprising one or
more transaction requests which cannot be completed
because of locks acquired in step (a);

(c) determining a timestamp at which a last lock
(last_lock_time) was obtained in step (a) from the plurality
of locks;

(d) enabling transactions in B, which could not be
completed in step (b) and were received before the
last_lock_time, to access locked data before transactions in
S access the locked data;

(e) enabling transactions in S to access the locked data
before enabling transactions in B received after
last_lock_time to access the locked data; and

(f) enabling transactions in B received after the
last_lock_time to access the locked data after transactions
in S have accessed the locked data.

2. The method of claim 1 wherein said T includes a
plurality of transaction managers and step (c) further
comprises the step of: for a subset T' of said T including a
plurality of transaction managers, each member ti of said T'
determining a timestamp for the last lock
(last_lock_time_i) it obtained in step (a).

3. The method of claim 2 further comprising the steps of:

(g) a coordinator program receiving values of the
last_lock_time_i from the plurality of transaction
managers ti in said T';
(h) the coordinator program determining the
last_lock_time from the values received in step (g); and

(i) the coordinator program sending a value of
last_lock_time determined in step (h) to one or more
transaction managers in said T'.

4. The method of claim 2 further comprising the steps of:

(j) a transaction manager ti receiving last_lock_time_j
values from one or more other transaction managers tj;
and

(k) the transaction manager ti determining the
last_lock_time from the values received in step (j).

5. The method as recited in claim 1, wherein at least part 
of the data which is locked is stored in at least one cache. 

6. The method as recited in claim 5, wherein at least one 
of the transaction managers includes at least one cache 
manager. 

7. A program storage device readable by machine,
tangibly embodying a program of instructions executable by
machine to perform method steps for consistently
performing a set S of one or more state-changing transactions
which modify state managed by a set T of one or more
transaction managers, according to any of claims 1.

8. In a system comprising a set of at least one transaction 
manager, a method for consistently performing a set S of at 
least one state-changing transactions which modify state 
managed by a set T of at least one transaction manager 
comprising the steps of: 

(a) acquiring a plurality of locks on data known as locked 
data which prevent transactions outside of S from one 
of (i) modifying data accessed by a transaction in S and 
(ii) reading data modified by a transaction in S; 

(b) storing a blocked request set B comprising at least one 
transaction request which cannot be completed because 
of locks acquired in step (a); 

(c) determining a timestamp at which a last lock
(last_lock_time) was obtained in step (a) from the plurality
of locks; and

(d) enabling transactions in B, which could not be
completed in step (b) and were received before the
last_lock_time, to access locked data before transactions in
S access the locked data.

9. The method as recited in claim 8, further comprising
the step of enabling transactions in S to access the locked
data before enabling transactions in B received after the
last_lock_time to access the locked data.
10. The method as recited in claim 9, further comprising
the step of enabling transactions in B received after the
last_lock_time to access the locked data after transactions
in S have accessed the locked data.

11. The method as recited in claim 8, further comprising 
the steps of: 

sending the last_lock_time to a program coordinator 
which evaluates last lock times and determines a latest 
lock time from the last lock times sent to the program 
coordinator from each transaction manager; 

sending the latest lock time to the transaction managers; 
and 

performing requests received prior to the latest lock time 
wherein requests for reading and modifying the data 
which is locked see a same view after all requests prior 
to the latest lock time are performed. 

12. The method as recited in claim 8, wherein the
last_lock_time is provided by a timestamp.

13. The method as recited in claim 8, wherein at least a
portion of the data which is locked is stored in at least one
cache.

14. The method as recited in claim 8, wherein at least one 
of the transaction managers includes at least one cache 
manager. 

15. The method as recited in claim 8, further comprising 
the steps of: 

exchanging last_lock_times of each of the transaction
managers with other transaction managers to evaluate
the last_lock_times and determine a latest lock time
from the last_lock_times from all the transaction
managers;

sending the latest lock time to other transaction managers; 
and 

performing requests received prior to the latest lock time 
wherein requests for reading and modifying the data 
which is locked see a same view after all requests prior 
to the latest lock time are performed. 

16. A program storage device readable by machine,
tangibly embodying a program of instructions executable by
machine to perform method steps for managing locks to
maintain consistency in a system performing transactions,
according to any of claims 8-15.
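
For illustration only, and forming no part of the claims, the
ordering recited above can be sketched in a few lines of Python.
The sketch assumes a coordinator that takes the latest of the
per-manager last_lock_time_i values, and then orders access to the
locked data as: blocked requests received before last_lock_time,
then the requests in S, then the remaining blocked requests. The
names Request, CacheManager, coordinator_last_lock_time, and
access_order are hypothetical. Two further assumptions are not
stated in the text: a request received exactly at last_lock_time is
treated as arriving after it, and requests within each group are
served in order of arrival.

```python
from dataclasses import dataclass


@dataclass
class Request:
    name: str
    received_at: float  # timestamp assigned when the request was received


@dataclass
class CacheManager:
    """Hypothetical stand-in for one cache manager i in the set C."""
    lock_times: list  # timestamps of the locks this manager acquired for S

    def last_lock_time_i(self) -> float:
        # Time at which this manager acquired its last lock.
        return max(self.lock_times)


def coordinator_last_lock_time(managers) -> float:
    """Collect every last_lock_time_i and keep the latest value."""
    return max(m.last_lock_time_i() for m in managers)


def access_order(blocked, s_requests, last_lock_time):
    """Return the order in which requests may access the locked data."""
    def arrival(r):
        return r.received_at

    # Assumption: a tie with last_lock_time counts as "after".
    early = sorted((r for r in blocked if r.received_at < last_lock_time),
                   key=arrival)
    late = sorted((r for r in blocked if r.received_at >= last_lock_time),
                  key=arrival)
    # Blocked requests from before last_lock_time, then S, then the rest.
    return early + list(s_requests) + late


if __name__ == "__main__":
    managers = [CacheManager([1.0, 3.5]), CacheManager([2.0, 4.2])]
    t = coordinator_last_lock_time(managers)
    blocked = [Request("r1", 3.0), Request("r2", 5.0), Request("r0", 4.0)]
    s = [Request("s1", 0.5), Request("s2", 0.6)]
    print(t, [r.name for r in access_order(blocked, s, t)])
```

With the sample values in the sketch, last_lock_time is 4.2, so r1
and r0 access the locked data first, then s1 and s2, and finally r2.
In the variation without a coordinator, each manager would evaluate
the same maximum over the values it receives from its peers.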